Chapter 1
Multi-variable calculus 

see Kaplan, Chapter 2: 2.1-2.22, Chapter 3: 3.9, 

Here we consider many fundamental notions from the calculus of many variables. 

1.1 Implicit functions 

The implicit function theorem is as follows: 

Theorem 

For a given $f(x,y)$ with $f = 0$ and $\partial f/\partial y \neq 0$ at the point $(x_0, y_0)$, there corresponds a
unique function $y(x)$ in the neighborhood of $(x_0, y_0)$.

More generally, we can think of a relation such as

$$ f(x_1, x_2, \ldots, x_N, y) = 0, \tag{1.1} $$

also written as

$$ f(x_n, y) = 0, \qquad n = 1, 2, \ldots, N, \tag{1.2} $$

in some region as an implicit function of $y$ with respect to the other variables. We cannot
have $\partial f/\partial y = 0$, because then $f$ would not depend on $y$ in this region. In principle, we can
write

$$ y = y(x_1, x_2, \ldots, x_N), \quad \text{or} \quad y = y(x_n), \qquad n = 1, \ldots, N, \tag{1.3} $$

if $\partial f/\partial y \neq 0$.

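The derivative $\partial y/\partial x_n$ can be determined from $f = 0$ without explicitly solving for $y$.
First, from the definition of the total derivative, we have

$$ df = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2 + \cdots + \frac{\partial f}{\partial x_n} dx_n + \cdots + \frac{\partial f}{\partial x_N} dx_N + \frac{\partial f}{\partial y} dy = 0. \tag{1.4} $$

Differentiating with respect to $x_n$ while holding all the other $x_m$, $m \neq n$, constant, we get

$$ \frac{\partial f}{\partial x_n} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial x_n} = 0, \tag{1.5} $$

so that

$$ \frac{\partial y}{\partial x_n} = -\frac{\partial f/\partial x_n}{\partial f/\partial y}, \tag{1.6} $$

which can be found if $\partial f/\partial y \neq 0$. That is to say, $y$ can be considered a function of $x_n$ if
$\partial f/\partial y \neq 0$.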
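
As a quick numerical illustration of Eq. (1.6), the following sketch compares the implicit-derivative formula with a direct finite-difference slope. The test function $f(x,y) = x^2 + y^2 - 1$, the point $(0.6, 0.8)$, and the helper names `partial` and `implicit_slope` are assumptions chosen for illustration; they do not appear in the notes.

```python
import math

def f(x, y):
    # Hypothetical test relation f(x, y) = 0: the unit circle.
    return x**2 + y**2 - 1.0

def partial(func, args, k, h=1e-6):
    # Central-difference estimate of the partial derivative of func
    # with respect to its k-th argument.
    up = list(args)
    dn = list(args)
    up[k] += h
    dn[k] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

def implicit_slope(func, x, y):
    # Eq. (1.6): dy/dx = -(df/dx) / (df/dy), valid only when df/dy != 0.
    return -partial(func, (x, y), 0) / partial(func, (x, y), 1)

x0, y0 = 0.6, 0.8
slope_implicit = implicit_slope(f, x0, y0)

# Independent check: on the upper branch, y(x) = sqrt(1 - x^2) is known explicitly.
h = 1e-6
slope_explicit = (math.sqrt(1.0 - (x0 + h)**2) - math.sqrt(1.0 - (x0 - h)**2)) / (2.0 * h)

print(slope_implicit, slope_explicit)
```

Both numbers should agree closely with the analytic value $-x_0/y_0$. At $y = 0$, where $\partial f/\partial y = 0$, the formula divides by zero, which is the failure the theorem anticipates.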

Let us now consider the equations 

f(x,y,u,v) = 0, (1.7) 

g(x,y,u,v) = 0. (1.8) 

Under certain circumstances, we can unravel Eqs. (1.7-1.8), either algebraically or numerically,
to form $u = u(x, y)$, $v = v(x, y)$. The conditions for the existence of such a functional
dependency can be found by differentiation of the original equations; for example,
differentiating Eq. (1.7) gives

$$ df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial u} du + \frac{\partial f}{\partial v} dv = 0. \tag{1.9} $$



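Holding $y$ constant and dividing by $dx$, we get

$$ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial x} = 0. \tag{1.10} $$

Operating on Eq. (1.8) in the same manner, we get

$$ \frac{\partial g}{\partial x} + \frac{\partial g}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial g}{\partial v} \frac{\partial v}{\partial x} = 0. \tag{1.11} $$

Similarly, holding $x$ constant and dividing by $dy$, we get

$$ \frac{\partial f}{\partial y} + \frac{\partial f}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial y} = 0, \tag{1.12} $$

$$ \frac{\partial g}{\partial y} + \frac{\partial g}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial g}{\partial v} \frac{\partial v}{\partial y} = 0. \tag{1.13} $$

Equations (1.10-1.11) can be solved for $\partial u/\partial x$ and $\partial v/\partial x$, and Eqs. (1.12-1.13) can be solved
for $\partial u/\partial y$ and $\partial v/\partial y$ by using the well known Cramer's rule; see Eq. (8.93). To solve for
$\partial u/\partial x$ and $\partial v/\partial x$, we first write Eqs. (1.10-1.11) in matrix form:

$$ \begin{pmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\ \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v} \end{pmatrix} \begin{pmatrix} \dfrac{\partial u}{\partial x} \\ \dfrac{\partial v}{\partial x} \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial f}{\partial x} \\ -\dfrac{\partial g}{\partial x} \end{pmatrix}. \tag{1.14} $$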

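1 Gabriel Cramer, 1704-1752, well-traveled Swiss-born mathematician who did enunciate his well known
rule, but was not the first to do so.

Thus, from Cramer's rule we have

$$ \frac{\partial u}{\partial x} = \frac{\begin{vmatrix} -\frac{\partial f}{\partial x} & \frac{\partial f}{\partial v} \\ -\frac{\partial g}{\partial x} & \frac{\partial g}{\partial v} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{vmatrix}} \equiv -\frac{\frac{\partial (f,g)}{\partial (x,v)}}{\frac{\partial (f,g)}{\partial (u,v)}}, \qquad \frac{\partial v}{\partial x} = \frac{\begin{vmatrix} \frac{\partial f}{\partial u} & -\frac{\partial f}{\partial x} \\ \frac{\partial g}{\partial u} & -\frac{\partial g}{\partial x} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{vmatrix}} \equiv -\frac{\frac{\partial (f,g)}{\partial (u,x)}}{\frac{\partial (f,g)}{\partial (u,v)}}. \tag{1.15} $$

In a similar fashion, we can form expressions for $\partial u/\partial y$ and $\partial v/\partial y$:

$$ \frac{\partial u}{\partial y} = \frac{\begin{vmatrix} -\frac{\partial f}{\partial y} & \frac{\partial f}{\partial v} \\ -\frac{\partial g}{\partial y} & \frac{\partial g}{\partial v} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{vmatrix}} \equiv -\frac{\frac{\partial (f,g)}{\partial (y,v)}}{\frac{\partial (f,g)}{\partial (u,v)}}, \qquad \frac{\partial v}{\partial y} = \frac{\begin{vmatrix} \frac{\partial f}{\partial u} & -\frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial u} & -\frac{\partial g}{\partial y} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{vmatrix}} \equiv -\frac{\frac{\partial (f,g)}{\partial (u,y)}}{\frac{\partial (f,g)}{\partial (u,v)}}. \tag{1.16} $$

Here we take the Jacobian matrix $\mathbf{J}$ of the transformation to be defined as

$$ \mathbf{J} = \begin{pmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\ \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v} \end{pmatrix}. \tag{1.17} $$

This is distinguished from the Jacobian determinant, $J$, defined as

$$ J = \det \mathbf{J} = \frac{\partial (f,g)}{\partial (u,v)} = \begin{vmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\ \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v} \end{vmatrix}. \tag{1.18} $$

If $J \neq 0$, the derivatives exist, and we indeed can form $u(x,y)$ and $v(x,y)$. This is the
condition for existence of implicit to explicit function conversion.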



Example 1.1
If

$$ x + y + u^6 + u + v = 0, \tag{1.19} $$
$$ xy + uv = 1, \tag{1.20} $$

find $\partial u/\partial x$.

Note that we have four unknowns in two equations. In principle we could solve for $u(x,y)$ and
$v(x,y)$ and then determine all partial derivatives, such as the one desired. In practice this is not always
possible; for example, there is no general solution to sixth order polynomial equations such as we have
here.

Equations (1.19-1.20) are rewritten as

$$ f(x,y,u,v) = x + y + u^6 + u + v = 0, \tag{1.21} $$
$$ g(x,y,u,v) = xy + uv - 1 = 0. \tag{1.22} $$

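2 Carl Gustav Jacob Jacobi, 1804-1851, German/Prussian mathematician who used these quantities,
which were first studied by Cauchy, in his work on partial differential equations.

Using the formula from Eq. (1.15) to solve for the desired derivative, we get

$$ \frac{\partial u}{\partial x} = \frac{\begin{vmatrix} -\frac{\partial f}{\partial x} & \frac{\partial f}{\partial v} \\ -\frac{\partial g}{\partial x} & \frac{\partial g}{\partial v} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{vmatrix}}. \tag{1.23} $$

Substituting, we get

$$ \frac{\partial u}{\partial x} = \frac{\begin{vmatrix} -1 & 1 \\ -y & u \end{vmatrix}}{\begin{vmatrix} 6u^5 + 1 & 1 \\ v & u \end{vmatrix}} = \frac{y - u}{u(6u^5 + 1) - v}. \tag{1.24} $$

Note when

$$ v = 6u^6 + u, \tag{1.25} $$

that the relevant Jacobian determinant is zero; at such points we can determine neither $\partial u/\partial x$ nor
$\partial u/\partial y$; thus, for such points we cannot form $u(x,y)$.

At points where the relevant Jacobian determinant $\partial(f,g)/\partial(u,v) \neq 0$ (which includes nearly all of
the $(x,y)$ plane), given a local value of $(x,y)$, we can use algebra to find a corresponding $u$ and $v$, which
may be multivalued, and use the formula developed to find the local value of the partial derivative.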
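
The result of Example 1.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it picks the point $u = 1$, $v = 2$, $x = -2 + \sqrt{5}$, $y = -2 - \sqrt{5}$ (which satisfies both equations), re-solves for $(u,v)$ at slightly perturbed $x$ with a hand-written Newton iteration, and compares the finite-difference slope with Eq. (1.24). The function names and the Newton solver are hypothetical helpers, not part of the notes.

```python
import math

def residuals(x, y, u, v):
    """Left-hand sides of Eqs. (1.21) and (1.22)."""
    f = x + y + u**6 + u + v
    g = x * y + u * v - 1.0
    return f, g

def solve_uv(x, y, u, v, max_iter=50, tol=1e-14):
    """Newton iteration for (u, v) at fixed (x, y); each 2x2 step uses Cramer's rule."""
    for _ in range(max_iter):
        f, g = residuals(x, y, u, v)
        fu, fv = 6.0 * u**5 + 1.0, 1.0   # df/du, df/dv
        gu, gv = v, u                    # dg/du, dg/dv
        det = fu * gv - fv * gu          # Jacobian determinant d(f,g)/d(u,v)
        du = (-f * gv + g * fv) / det
        dv = (-g * fu + f * gu) / det
        u, v = u + du, v + dv
        if abs(du) + abs(dv) < tol:
            break
    return u, v

def du_dx_formula(y, u, v):
    """Eq. (1.24)."""
    return (y - u) / (u * (6.0 * u**5 + 1.0) - v)

# Assumed sample point; it satisfies both equations with u = 1, v = 2.
x0 = -2.0 + math.sqrt(5.0)
y0 = -2.0 - math.sqrt(5.0)
u0, v0 = 1.0, 2.0

h = 1.0e-6
u_plus, _ = solve_uv(x0 + h, y0, u0, v0)
u_minus, _ = solve_uv(x0 - h, y0, u0, v0)
du_dx_fd = (u_plus - u_minus) / (2.0 * h)
du_dx_exact = du_dx_formula(y0, u0, v0)

print(du_dx_fd, du_dx_exact)
```

At this point the Jacobian determinant is $u(6u^5+1) - v = 5$, safely away from zero, so the Newton iteration and the formula are both well conditioned.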



1.2 Functional dependence 



Let u = u(x, y) and v = v(x, y). If we can write u = g(v) or v = h(u), then u and v are said 
to be functionally dependent. If functional dependence between u and v exists, then we can 
consider f(u,v) = 0. So, 



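$$ \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial x} = 0, \tag{1.26} $$
$$ \frac{\partial f}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial y} = 0. \tag{1.27} $$

In matrix form, this is

$$ \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} \\ \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} \end{pmatrix} \begin{pmatrix} \dfrac{\partial f}{\partial u} \\ \dfrac{\partial f}{\partial v} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{1.28} $$

Since the right hand side is zero, and we desire a non-trivial solution, the determinant of the
coefficient matrix must be zero for functional dependency, i.e.

$$ \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} \\ \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} \end{vmatrix} = 0. \tag{1.29} $$

Note, since $\det \mathbf{J} = \det \mathbf{J}^T$, that this is equivalent to

$$ J = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \frac{\partial (u,v)}{\partial (x,y)} = 0. \tag{1.30} $$

That is, the Jacobian determinant $J$ must be zero for functional dependence.

Example 1.2

Determine if

$$ u = y + z, \tag{1.31} $$
$$ v = x + 2z^2, \tag{1.32} $$
$$ w = x - 4yz - 2y^2, \tag{1.33} $$

are functionally dependent.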

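The determinant of the resulting coefficient matrix, by extension to three functions of three
variables, is

$$ \frac{\partial (u,v,w)}{\partial (x,y,z)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} & \dfrac{\partial v}{\partial z} \\ \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} & \dfrac{\partial w}{\partial z} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} & \dfrac{\partial w}{\partial x} \\ \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} & \dfrac{\partial w}{\partial y} \\ \dfrac{\partial u}{\partial z} & \dfrac{\partial v}{\partial z} & \dfrac{\partial w}{\partial z} \end{vmatrix}, \tag{1.34} $$

$$ = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 0 & -4(y+z) \\ 1 & 4z & -4y \end{vmatrix}, \tag{1.35} $$

$$ = (-1)\left(-4y - (-4)(y+z)\right) + (1)(4z), \tag{1.36} $$

$$ = 4y - 4y - 4z + 4z, \tag{1.37} $$

$$ = 0. \tag{1.38} $$

So, $u$, $v$, $w$ are functionally dependent. In fact $w = v - 2u^2$.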
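
The conclusion of Example 1.2 can be spot-checked numerically. The following is a minimal sketch, assuming randomly sampled test points in $[-2, 2]^3$; the helper names `uvw`, `jacobian`, and `det3` are illustrative only. It evaluates the Jacobian determinant from the analytic partial derivatives and confirms the relation $w = v - 2u^2$.

```python
import random

def uvw(x, y, z):
    # The three functions of Eqs. (1.31)-(1.33).
    u = y + z
    v = x + 2.0 * z**2
    w = x - 4.0 * y * z - 2.0 * y**2
    return u, v, w

def jacobian(x, y, z):
    # Rows: u, v, w. Columns: derivatives with respect to x, y, z.
    return [
        [0.0, 1.0, 1.0],
        [1.0, 0.0, 4.0 * z],
        [1.0, -4.0 * z - 4.0 * y, -4.0 * y],
    ]

def det3(m):
    # Cofactor expansion of a 3x3 determinant along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

random.seed(0)
samples = [tuple(random.uniform(-2.0, 2.0) for _ in range(3)) for _ in range(5)]
max_det = max(abs(det3(jacobian(*p))) for p in samples)
max_gap = max(abs(uvw(*p)[2] - (uvw(*p)[1] - 2.0 * uvw(*p)[0]**2)) for p in samples)

print(max_det, max_gap)
```

Both printed values should be zero to within rounding, consistent with a Jacobian determinant that vanishes identically rather than only at isolated points.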

Example 1.3 

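Let

$$ x + y + z = 0, \tag{1.39} $$
$$ x^2 + y^2 + z^2 + 2xz = 1. \tag{1.40} $$

Can $x$ and $y$ be considered as functions of $z$?

If $x = x(z)$ and $y = y(z)$, then $dx/dz$ and $dy/dz$ must exist. If we take

$$ f(x,y,z) = x + y + z = 0, \tag{1.41} $$
$$ g(x,y,z) = x^2 + y^2 + z^2 + 2xz - 1 = 0, \tag{1.42} $$
$$ df = \frac{\partial f}{\partial z} dz + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = 0, \tag{1.43} $$
$$ dg = \frac{\partial g}{\partial z} dz + \frac{\partial g}{\partial x} dx + \frac{\partial g}{\partial y} dy = 0, \tag{1.44} $$
$$ \frac{\partial f}{\partial z} + \frac{\partial f}{\partial x} \frac{dx}{dz} + \frac{\partial f}{\partial y} \frac{dy}{dz} = 0, \tag{1.45} $$
$$ \frac{\partial g}{\partial z} + \frac{\partial g}{\partial x} \frac{dx}{dz} + \frac{\partial g}{\partial y} \frac{dy}{dz} = 0, \tag{1.46} $$
$$ \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\ \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{pmatrix} \begin{pmatrix} \dfrac{dx}{dz} \\ \dfrac{dy}{dz} \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial f}{\partial z} \\ -\dfrac{\partial g}{\partial z} \end{pmatrix}, \tag{1.47} $$

then the solution vector $(dx/dz, dy/dz)^T$ can be obtained by Cramer's rule:

$$ \frac{dx}{dz} = \frac{\begin{vmatrix} -\frac{\partial f}{\partial z} & \frac{\partial f}{\partial y} \\ -\frac{\partial g}{\partial z} & \frac{\partial g}{\partial y} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix}} = \frac{\begin{vmatrix} -1 & 1 \\ -(2z + 2x) & 2y \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 2x + 2z & 2y \end{vmatrix}} = \frac{-2y + 2z + 2x}{2y - 2x - 2z} = -1, \tag{1.48} $$

$$ \frac{dy}{dz} = \frac{\begin{vmatrix} \frac{\partial f}{\partial x} & -\frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & -\frac{\partial g}{\partial z} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix}} = \frac{\begin{vmatrix} 1 & -1 \\ 2x + 2z & -(2z + 2x) \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 2x + 2z & 2y \end{vmatrix}} = \frac{0}{2y - 2x - 2z}. \tag{1.49} $$

Note that in the expression for $dx/dz$ the numerator and denominator cancel; there is no
special condition defined by the Jacobian determinant of the denominator being zero. In the second,
$dy/dz = 0$ if $y - x - z \neq 0$; if instead $y - x - z = 0$, this formula cannot give us the derivative.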

Now, in fact, it is easily shown by algebraic manipulations (which for more general functions are
not possible) that

$$ x(z) = -z \pm \frac{\sqrt{2}}{2}, \tag{1.50} $$
$$ y(z) = \mp \frac{\sqrt{2}}{2}. \tag{1.51} $$

This forms two distinct lines in $x, y, z$ space. Note that on the lines of intersection of the two surfaces
$J = 2y - 2x - 2z = \mp 2\sqrt{2}$, which is never indeterminate.

The two original functions and their loci of intersection are plotted in Fig. 1.1. It is seen that the
surface represented by the linear function, Eq. (1.39), is a plane, and that represented by the quadratic
function, Eq. (1.40), is an open cylindrical tube. Note that planes and cylinders may or may not
intersect. If they intersect, it is most likely that the intersection will be a closed arc. However, when
the plane is aligned with the axis of the cylinder, the intersection will be two non-intersecting lines;
such is the case in this example.

Let us see how slightly altering the equation for the plane removes the degeneracy. Take now

$$ 5x + y + z = 0, \tag{1.52} $$
$$ x^2 + y^2 + z^2 + 2xz = 1. \tag{1.53} $$

Can $x$ and $y$ be considered as functions of $z$? If $x = x(z)$ and $y = y(z)$, then $dx/dz$ and $dy/dz$ must
exist. If we take

$$ f(x,y,z) = 5x + y + z = 0, \tag{1.54} $$
$$ g(x,y,z) = x^2 + y^2 + z^2 + 2xz - 1 = 0, \tag{1.55} $$

Figure 1.1: Surfaces of $x + y + z = 0$ and $x^2 + y^2 + z^2 + 2xz = 1$, and their loci of intersection.



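then the solution vector $(dx/dz, dy/dz)^T$ is found as before:

$$ \frac{dx}{dz} = \frac{\begin{vmatrix} -\frac{\partial f}{\partial z} & \frac{\partial f}{\partial y} \\ -\frac{\partial g}{\partial z} & \frac{\partial g}{\partial y} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix}} = \frac{\begin{vmatrix} -1 & 1 \\ -(2z + 2x) & 2y \end{vmatrix}}{\begin{vmatrix} 5 & 1 \\ 2x + 2z & 2y \end{vmatrix}} = \frac{-2y + 2z + 2x}{10y - 2x - 2z}, \tag{1.56} $$

$$ \frac{dy}{dz} = \frac{\begin{vmatrix} \frac{\partial f}{\partial x} & -\frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & -\frac{\partial g}{\partial z} \end{vmatrix}}{\begin{vmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{vmatrix}} = \frac{\begin{vmatrix} 5 & -1 \\ 2x + 2z & -(2z + 2x) \end{vmatrix}}{\begin{vmatrix} 5 & 1 \\ 2x + 2z & 2y \end{vmatrix}} = \frac{-8x - 8z}{10y - 2x - 2z}. \tag{1.57} $$

The two original functions and their loci of intersection are plotted in Fig. 1.2.
Straightforward algebra in this case shows that an explicit dependency exists:

$$ x(z) = \frac{-6z \pm \sqrt{2}\sqrt{13 - 8z^2}}{26}, \tag{1.58} $$
$$ y(z) = \frac{4z \mp 5\sqrt{2}\sqrt{13 - 8z^2}}{26}. \tag{1.59} $$

The sign of the $z$ term in $y(z)$ follows from the plane equation, $y = -5x - z$. These curves
represent the projection of the curve of intersection on the $x, z$ and $y, z$ planes, respectively.
In both cases, the projections are ellipses.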
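
The second part of Example 1.3 can be verified numerically. The sketch below assumes the upper-sign branch of Eqs. (1.58-1.59) and an arbitrarily chosen sample value $z = 0.5$; the names `branch` and `slopes` are hypothetical. It checks that the explicit solution lies on both surfaces and that its finite-difference slopes match Eqs. (1.56-1.57).

```python
import math

def branch(z, sign=1.0):
    # Explicit solution, Eqs. (1.58)-(1.59); sign=+1 selects the upper signs.
    root = math.sqrt(2.0) * math.sqrt(13.0 - 8.0 * z * z)
    x = (-6.0 * z + sign * root) / 26.0
    y = (4.0 * z - sign * 5.0 * root) / 26.0
    return x, y

def slopes(x, y, z):
    # Cramer's-rule results, Eqs. (1.56)-(1.57).
    den = 10.0 * y - 2.0 * x - 2.0 * z
    return (-2.0 * y + 2.0 * z + 2.0 * x) / den, (-8.0 * x - 8.0 * z) / den

z0 = 0.5
x0, y0 = branch(z0)

# Residuals of the plane and the cylinder at the sample point.
plane_residual = 5.0 * x0 + y0 + z0
tube_residual = x0**2 + y0**2 + z0**2 + 2.0 * x0 * z0 - 1.0

# Central differences along the curve of intersection.
h = 1.0e-6
xp, yp = branch(z0 + h)
xm, ym = branch(z0 - h)
dxdz_fd = (xp - xm) / (2.0 * h)
dydz_fd = (yp - ym) / (2.0 * h)
dxdz, dydz = slopes(x0, y0, z0)

print(plane_residual, tube_residual)
print(dxdz_fd, dxdz, dydz_fd, dydz)
```

The explicit branch is real only for $8z^2 \le 13$; outside that interval the plane and the tube do not intersect and `math.sqrt` raises an error.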



1.3 Coordinate transformations 

Many problems are formulated in three-dimensional Cartesian space. However, many of
these problems, especially those involving curved geometrical bodies, are more efficiently
posed in a non-Cartesian, curvilinear coordinate system. To facilitate analysis involving
such geometries, one needs techniques to transform from one coordinate system to another.
For this section, we will utilize an index notation, introduced by Einstein. We will take
untransformed Cartesian coordinates to be represented by $(\xi^1, \xi^2, \xi^3)$. Here the superscript
is an index and does not represent a power of $\xi$. We will denote this point by $\xi^i$, where
$i = 1, 2, 3$. Because the space is Cartesian, we have the usual Euclidean distance from
Pythagoras' theorem for a differential arc length $ds$:

$$ (ds)^2 = (d\xi^1)^2 + (d\xi^2)^2 + (d\xi^3)^2, \tag{1.60} $$
$$ (ds)^2 = \sum_{i=1}^{3} d\xi^i d\xi^i \equiv d\xi^i d\xi^i. \tag{1.61} $$

Figure 1.2: Surfaces of $5x + y + z = 0$ and $x^2 + y^2 + z^2 + 2xz = 1$, and their loci of intersection.

Here we have adopted Einstein's summation convention that when an index appears twice,
a summation from 1 to 3 is understood. Though it makes little difference here, to strictly
adhere to the conventions of the Einstein notation, which require a balance of sub- and
superscripts, we should more formally take

$$ (ds)^2 = d\xi^j \delta_{ji} d\xi^i = d\xi_i d\xi^i, \tag{1.62} $$

3 René Descartes, 1596-1650, French mathematician and philosopher.
4 Albert Einstein, 1879-1955, German/American physicist and mathematician.
5 Euclid of Alexandria, ~325 B.C.-~265 B.C., Greek geometer.
6 Pythagoras of Samos, c. 570-c. 490 BC, Ionian Greek mathematician, philosopher, and mystic to whom
this theorem is traditionally credited.

where $\delta_{ji}$ is the Kronecker delta,

$$ \delta_{ji} = \delta^{ji} = \delta^i_j = \begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases} \tag{1.63} $$

In matrix form, the Kronecker delta is simply the identity matrix $\mathbf{I}$, e.g.

$$ \delta_{ji} = \delta^{ji} = \delta^i_j = \mathbf{I} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.64} $$

Now let us consider a point $P$ whose representation in Cartesian coordinates is $(\xi^1, \xi^2, \xi^3)$
and map those coordinates so that it is now represented in a more convenient $(x^1, x^2, x^3)$
space. This mapping is achieved by defining the following functional dependencies:

$$ x^1 = x^1(\xi^1, \xi^2, \xi^3), \tag{1.65} $$
$$ x^2 = x^2(\xi^1, \xi^2, \xi^3), \tag{1.66} $$
$$ x^3 = x^3(\xi^1, \xi^2, \xi^3). \tag{1.67} $$

We note that in this example we make the common presumption that the entity P is invariant 
and that it has different representations in different coordinate systems. Thus, the coordinate 
axes change, but the location of P does not. This is known as an alias transformation. This 
contrasts another common approach in which a point is represented in an original space, 
and after application of a transformation, it is again represented in the original space in an 
altered state. This is known as an alibi transformation. The alias approach transforms the 
axes; the alibi approach transforms the elements of the space. 
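Taking derivatives can tell us whether the inverse exists.

$$ dx^1 = \frac{\partial x^1}{\partial \xi^1} d\xi^1 + \frac{\partial x^1}{\partial \xi^2} d\xi^2 + \frac{\partial x^1}{\partial \xi^3} d\xi^3 = \frac{\partial x^1}{\partial \xi^j} d\xi^j, \tag{1.68} $$
$$ dx^2 = \frac{\partial x^2}{\partial \xi^1} d\xi^1 + \frac{\partial x^2}{\partial \xi^2} d\xi^2 + \frac{\partial x^2}{\partial \xi^3} d\xi^3 = \frac{\partial x^2}{\partial \xi^j} d\xi^j, \tag{1.69} $$
$$ dx^3 = \frac{\partial x^3}{\partial \xi^1} d\xi^1 + \frac{\partial x^3}{\partial \xi^2} d\xi^2 + \frac{\partial x^3}{\partial \xi^3} d\xi^3 = \frac{\partial x^3}{\partial \xi^j} d\xi^j, \tag{1.70} $$

$$ \begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x^1}{\partial \xi^1} & \dfrac{\partial x^1}{\partial \xi^2} & \dfrac{\partial x^1}{\partial \xi^3} \\ \dfrac{\partial x^2}{\partial \xi^1} & \dfrac{\partial x^2}{\partial \xi^2} & \dfrac{\partial x^2}{\partial \xi^3} \\ \dfrac{\partial x^3}{\partial \xi^1} & \dfrac{\partial x^3}{\partial \xi^2} & \dfrac{\partial x^3}{\partial \xi^3} \end{pmatrix} \begin{pmatrix} d\xi^1 \\ d\xi^2 \\ d\xi^3 \end{pmatrix}, \tag{1.71} $$

$$ dx^i = \frac{\partial x^i}{\partial \xi^j} d\xi^j. \tag{1.72} $$

In order for the inverse to exist we must have a non-zero Jacobian determinant for the
transformation, i.e.

$$ \frac{\partial (x^1, x^2, x^3)}{\partial (\xi^1, \xi^2, \xi^3)} \neq 0. \tag{1.73} $$

7 Leopold Kronecker, 1823-1891, German/Prussian mathematician.

As long as Eq. (1.73) is satisfied, the inverse transformation exists:

$$ \xi^1 = \xi^1(x^1, x^2, x^3), \tag{1.74} $$
$$ \xi^2 = \xi^2(x^1, x^2, x^3), \tag{1.75} $$
$$ \xi^3 = \xi^3(x^1, x^2, x^3). \tag{1.76} $$

Likewise then,

$$ d\xi^i = \frac{\partial \xi^i}{\partial x^j} dx^j. \tag{1.77} $$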

1.3.1 Jacobian matrices and metric tensors 

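Defining the Jacobian matrix $\mathbf{J}$ to be associated with the inverse transformation, Eq. (1.77),
we take

$$ \mathbf{J} = \frac{\partial \xi^i}{\partial x^j} = \begin{pmatrix} \dfrac{\partial \xi^1}{\partial x^1} & \dfrac{\partial \xi^1}{\partial x^2} & \dfrac{\partial \xi^1}{\partial x^3} \\ \dfrac{\partial \xi^2}{\partial x^1} & \dfrac{\partial \xi^2}{\partial x^2} & \dfrac{\partial \xi^2}{\partial x^3} \\ \dfrac{\partial \xi^3}{\partial x^1} & \dfrac{\partial \xi^3}{\partial x^2} & \dfrac{\partial \xi^3}{\partial x^3} \end{pmatrix}. \tag{1.78} $$

We can then rewrite $d\xi^i$ from Eq. (1.77) in Gibbs' vector notation as

$$ d\boldsymbol{\xi} = \mathbf{J} \cdot d\mathbf{x}. \tag{1.79} $$

Now for Euclidean spaces, distance must be independent of coordinate systems, so we
require

$$ (ds)^2 = d\xi^i d\xi^i = \left( \frac{\partial \xi^i}{\partial x^k} dx^k \right) \left( \frac{\partial \xi^i}{\partial x^l} dx^l \right) = dx^k \underbrace{\frac{\partial \xi^i}{\partial x^k} \frac{\partial \xi^i}{\partial x^l}}_{g_{kl}} dx^l. \tag{1.80} $$

In Gibbs' vector notation Eq. (1.80) becomes

$$ (ds)^2 = d\boldsymbol{\xi}^T \cdot d\boldsymbol{\xi}, \tag{1.81} $$
$$ = (\mathbf{J} \cdot d\mathbf{x})^T \cdot (\mathbf{J} \cdot d\mathbf{x}). \tag{1.82} $$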



8 The definition we adopt influences the form of many of our formulae given throughout the remainder of
these notes. There are three obvious alternates: i) An argument can be made that a better definition of
$\mathbf{J}$ would be the transpose of our Jacobian matrix: $\mathbf{J} \to \mathbf{J}^T$. This is because when one considers that the
differential operator acts first, the Jacobian matrix is really $\frac{\partial}{\partial x^j} \xi^i$, and the alternative definition is more
consistent with traditional matrix notation, which would have the first row as $\left( \frac{\partial}{\partial x^1} \xi^1, \frac{\partial}{\partial x^1} \xi^2, \frac{\partial}{\partial x^1} \xi^3 \right)$, ii)
Many others, e.g. Kay, adopt as $\mathbf{J}$ the inverse of our Jacobian matrix: $\mathbf{J} \to \mathbf{J}^{-1}$. This Jacobian matrix is
thus defined in terms of the forward transformation, $\partial x^i/\partial \xi^j$, or iii) One could adopt $\mathbf{J} \to (\mathbf{J}^T)^{-1}$. As long
as one realizes the implications of the notation, however, the convention adopted ultimately does not matter.

9 Josiah Willard Gibbs, 1839-1903, prolific American mechanical engineer and mathematician with a lifetime
affiliation with Yale University as well as the recipient of the first American doctorate in engineering.

10 Common alternate formulations of vector mechanics of non-Cartesian spaces view the Jacobian as an
intrinsic part of the dot product and would say instead that by definition $(ds)^2 = d\mathbf{x} \cdot d\mathbf{x}$. Such formulations
have no need for the transpose operation, especially since they do not carry forward simply to non-Cartesian
systems. The formulation used here has the advantage of explicitly recognizing the linear algebra operations
necessary to form the scalar $ds$. These same alternate notations reserve the dot product for that between
a vector and a vector and would hold instead that $d\boldsymbol{\xi} = \mathbf{J} d\mathbf{x}$. However, this could be confused with raising
the dimension of the quantity of interest; whereas we use the dot to lower the dimension.

Now, it can be shown that $(\mathbf{J} \cdot d\mathbf{x})^T = d\mathbf{x}^T \cdot \mathbf{J}^T$ (see also Sec. 8.2.3.5), so

$$ (ds)^2 = d\mathbf{x}^T \cdot \underbrace{\mathbf{J}^T \cdot \mathbf{J}}_{\mathbf{G}} \cdot d\mathbf{x}. \tag{1.83} $$

If we define the metric tensor, $g_{kl}$ or $\mathbf{G}$, as follows:

$$ g_{kl} = \frac{\partial \xi^i}{\partial x^k} \frac{\partial \xi^i}{\partial x^l}, \tag{1.84} $$
$$ \mathbf{G} = \mathbf{J}^T \cdot \mathbf{J}, \tag{1.85} $$

then we have, equivalently in both Einstein and Gibbs notations,

$$ (ds)^2 = dx^k g_{kl} dx^l, \tag{1.86} $$
$$ (ds)^2 = d\mathbf{x}^T \cdot \mathbf{G} \cdot d\mathbf{x}. \tag{1.87} $$

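Note that in Einstein notation, one can loosely imagine super-scripted terms in a denominator
as being sub-scripted terms in a corresponding numerator. Now $g_{kl}$ can be represented as a
matrix. If we define

$$ g = \det g_{kl}, \tag{1.88} $$

it can be shown that the ratio of volumes of differential elements in one space to that of the
other is given by

$$ d\xi^1 \, d\xi^2 \, d\xi^3 = \sqrt{g} \; dx^1 \, dx^2 \, dx^3. \tag{1.89} $$

Thus, transformations for which $g = 1$ are volume-preserving. Volume-preserving
transformations also have $J = \det \mathbf{J} = \pm 1$. It can also be shown that if $J = \det \mathbf{J} > 0$, the
transformation is locally orientation-preserving. If $J = \det \mathbf{J} < 0$, the transformation is
orientation-reversing, and thus involves a reflection. So, if $J = \det \mathbf{J} = 1$, the transformation
is volume- and orientation-preserving.

We also require dependent variables and all derivatives to take on the same values at
corresponding points in each space, e.g. if $\phi$ ($\phi = f(\xi^1, \xi^2, \xi^3) = h(x^1, x^2, x^3)$) is a dependent
variable defined at $(\hat\xi^1, \hat\xi^2, \hat\xi^3)$, and $(\hat\xi^1, \hat\xi^2, \hat\xi^3)$ maps into $(\hat x^1, \hat x^2, \hat x^3)$, we require
$f(\hat\xi^1, \hat\xi^2, \hat\xi^3) = h(\hat x^1, \hat x^2, \hat x^3)$. The chain rule lets us transform derivatives to other spaces:

$$ \begin{pmatrix} \dfrac{\partial \phi}{\partial x^1} & \dfrac{\partial \phi}{\partial x^2} & \dfrac{\partial \phi}{\partial x^3} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial \phi}{\partial \xi^1} & \dfrac{\partial \phi}{\partial \xi^2} & \dfrac{\partial \phi}{\partial \xi^3} \end{pmatrix} \underbrace{\begin{pmatrix} \dfrac{\partial \xi^1}{\partial x^1} & \dfrac{\partial \xi^1}{\partial x^2} & \dfrac{\partial \xi^1}{\partial x^3} \\ \dfrac{\partial \xi^2}{\partial x^1} & \dfrac{\partial \xi^2}{\partial x^2} & \dfrac{\partial \xi^2}{\partial x^3} \\ \dfrac{\partial \xi^3}{\partial x^1} & \dfrac{\partial \xi^3}{\partial x^2} & \dfrac{\partial \xi^3}{\partial x^3} \end{pmatrix}}_{\mathbf{J}}, \tag{1.90} $$

$$ \frac{\partial \phi}{\partial x^i} = \frac{\partial \phi}{\partial \xi^j} \frac{\partial \xi^j}{\partial x^i}. \tag{1.91} $$

Equation (1.91) can also be inverted, given that $g \neq 0$, to find $(\partial \phi/\partial \xi^1, \partial \phi/\partial \xi^2, \partial \phi/\partial \xi^3)$.
Employing Gibbs notation we can write Eq. (1.91) as

$$ \nabla_x^T \phi = \nabla_\xi^T \phi \cdot \mathbf{J}. \tag{1.92} $$

The fact that the gradient operator required the use of row vectors in conjunction with the
Jacobian matrix, while the transformation of distance, earlier in this section, Eq. (1.79),
required the use of column vectors is of fundamental importance, and will be soon examined
further in Sec. 1.3.2 where we distinguish between what are known as covariant and
contravariant vectors.

Transposing both sides of Eq. (1.92), we could also say

$$ \nabla_x \phi = \mathbf{J}^T \cdot \nabla_\xi \phi. \tag{1.93} $$

Inverting, we then have

$$ \nabla_\xi \phi = (\mathbf{J}^T)^{-1} \cdot \nabla_x \phi. \tag{1.94} $$

Thus, in general, we could say for the gradient operator

$$ \nabla_\xi = (\mathbf{J}^T)^{-1} \cdot \nabla_x. \tag{1.95} $$

Contrasting Eq. (1.95) with Eq. (1.79), $d\boldsymbol{\xi} = \mathbf{J} \cdot d\mathbf{x}$, we see the gradient operation transforms
in a fundamentally different way than the differential operation $d$, unless we restrict attention
to an unusual $\mathbf{J}$, one whose transpose is equal to its inverse. We will sometimes make this
restriction, and sometimes not. When we choose such a special $\mathbf{J}$, there will be many
additional simplifications in the analysis; these are realized because it will be seen for many
such transformations that nearly all of the original Cartesian character will be retained,
albeit in a rotated, but otherwise undeformed, coordinate system. We shall later identify a
matrix whose transpose is equal to its inverse as an orthogonal matrix, $\mathbf{Q}$: $\mathbf{Q}^T = \mathbf{Q}^{-1}$, and
study it in detail in Secs. 6.2.1, 8.6.

One can also show the relation between $\partial \xi^i/\partial x^j$ and $\partial x^i/\partial \xi^j$ to be

$$ \frac{\partial \xi^i}{\partial x^j} = \left( \left( \frac{\partial x^i}{\partial \xi^j} \right)^T \right)^{-1} = \left( \frac{\partial x^j}{\partial \xi^i} \right)^{-1}, \tag{1.96} $$

$$ \begin{pmatrix} \dfrac{\partial \xi^1}{\partial x^1} & \dfrac{\partial \xi^1}{\partial x^2} & \dfrac{\partial \xi^1}{\partial x^3} \\ \dfrac{\partial \xi^2}{\partial x^1} & \dfrac{\partial \xi^2}{\partial x^2} & \dfrac{\partial \xi^2}{\partial x^3} \\ \dfrac{\partial \xi^3}{\partial x^1} & \dfrac{\partial \xi^3}{\partial x^2} & \dfrac{\partial \xi^3}{\partial x^3} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x^1}{\partial \xi^1} & \dfrac{\partial x^1}{\partial \xi^2} & \dfrac{\partial x^1}{\partial \xi^3} \\ \dfrac{\partial x^2}{\partial \xi^1} & \dfrac{\partial x^2}{\partial \xi^2} & \dfrac{\partial x^2}{\partial \xi^3} \\ \dfrac{\partial x^3}{\partial \xi^1} & \dfrac{\partial x^3}{\partial \xi^2} & \dfrac{\partial x^3}{\partial \xi^3} \end{pmatrix}^{-1}. \tag{1.97} $$

11 In Cartesian coordinates, we take $\nabla_\xi \equiv \begin{pmatrix} \frac{\partial}{\partial \xi^1} \\ \frac{\partial}{\partial \xi^2} \\ \frac{\partial}{\partial \xi^3} \end{pmatrix}$. This gives rise to the natural, albeit unconventional,
notation $\nabla_\xi^T = \begin{pmatrix} \frac{\partial}{\partial \xi^1} & \frac{\partial}{\partial \xi^2} & \frac{\partial}{\partial \xi^3} \end{pmatrix}$. This notion does not extend easily to non-Cartesian systems, for which
index notation is preferred. Here, for convenience, we will take $\nabla_x^T \equiv \begin{pmatrix} \frac{\partial}{\partial x^1} & \frac{\partial}{\partial x^2} & \frac{\partial}{\partial x^3} \end{pmatrix}$, and a similar
column version for $\nabla_x$.

Thus, the Jacobian matrix $\mathbf{J}$ of the transformation is simply the inverse of the Jacobian
matrix of the inverse transformation. Note that in the very special case for which the transpose
is the inverse, we can replace the inverse by the transpose. Note that the transpose of
the transpose is the original matrix and determines that $\partial \xi^i/\partial x^j = \partial x^i/\partial \xi^j$. This allows the
$i$ to remain "upstairs" and the $j$ to remain "downstairs." Such a transformation will be seen
to be a pure rotation or reflection.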



Example 1.4

Transform the Cartesian equation

$$ \frac{\partial \phi}{\partial \xi^1} + \frac{\partial \phi}{\partial \xi^2} = (\xi^1)^2 + (\xi^2)^2 \tag{1.98} $$

under the following:

1. Cartesian to linearly homogeneous affine coordinates.

Consider the following linear non-orthogonal transformation:

$$ x^1 = \frac{2}{3} \xi^1 + \frac{2}{3} \xi^2, \tag{1.99} $$
$$ x^2 = -\frac{2}{3} \xi^1 + \frac{1}{3} \xi^2, \tag{1.100} $$
$$ x^3 = \xi^3. \tag{1.101} $$

This transformation is of the class of affine transformations, which are of the form

$$ x^i = A^i_j \xi^j + b^i, \tag{1.102} $$

where $A^i_j$ and $b^i$ are constants. Affine transformations for which $b^i = 0$ are further distinguished
as linear homogeneous transformations. The transformation of this example is both affine and linear
homogeneous.

Equations (1.99-1.101) form a linear system of three equations in three unknowns; using standard
techniques of linear algebra allows us to solve for $\xi^1, \xi^2, \xi^3$ in terms of $x^1, x^2, x^3$; that is, we find the
inverse transformation, which is

$$ \xi^1 = \frac{1}{2} x^1 - x^2, \tag{1.103} $$
$$ \xi^2 = x^1 + x^2, \tag{1.104} $$
$$ \xi^3 = x^3. \tag{1.105} $$

Lines of constant $x^1$ and $x^2$ in the $\xi^1, \xi^2$ plane as well as lines of constant $\xi^1$ and $\xi^2$ in the $x^1, x^2$
plane are plotted in Fig. 1.3. Also shown is a unit square in the Cartesian $\xi^1, \xi^2$ plane, with vertices
A, B, C, D. The image of this rectangle is plotted as a parallelogram in the $x^1, x^2$ plane. It is seen the
orientation has been preserved in what amounts to a clockwise rotation accompanied by stretching;
moreover, the area (and thus the volume in three dimensions) has been decreased.

The appropriate Jacobian matrix for the inverse transformation is

$$ \mathbf{J} = \frac{\partial \xi^i}{\partial x^j} = \frac{\partial (\xi^1, \xi^2, \xi^3)}{\partial (x^1, x^2, x^3)} = \begin{pmatrix} \dfrac{\partial \xi^1}{\partial x^1} & \dfrac{\partial \xi^1}{\partial x^2} & \dfrac{\partial \xi^1}{\partial x^3} \\ \dfrac{\partial \xi^2}{\partial x^1} & \dfrac{\partial \xi^2}{\partial x^2} & \dfrac{\partial \xi^2}{\partial x^3} \\ \dfrac{\partial \xi^3}{\partial x^1} & \dfrac{\partial \xi^3}{\partial x^2} & \dfrac{\partial \xi^3}{\partial x^3} \end{pmatrix}, \tag{1.106} $$

$$ \mathbf{J} = \begin{pmatrix} \frac{1}{2} & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.107} $$

Figure 1.3: Lines of constant $x^1$ and $x^2$ in the $\xi^1, \xi^2$ plane and lines of constant $\xi^1$ and $\xi^2$ in
the $x^1, x^2$ plane for the homogeneous affine transformation of example problem.



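The Jacobian determinant is

$$ J = \det \mathbf{J} = (1) \left( \left( \frac{1}{2} \right) (1) - (-1)(1) \right) = \frac{3}{2}. \tag{1.108} $$

So a unique transformation, $\boldsymbol{\xi} = \mathbf{J} \cdot \mathbf{x}$, always exists, since the Jacobian determinant is never zero.
Inversion gives $\mathbf{x} = \mathbf{J}^{-1} \cdot \boldsymbol{\xi}$. Since $J > 0$, the transformation preserves the orientation of geometric
entities. Since $J > 1$, a unit volume element in $\xi$ space is larger than its image in $x$ space.
The metric tensor is

$$ g_{kl} = \frac{\partial \xi^i}{\partial x^k} \frac{\partial \xi^i}{\partial x^l} = \frac{\partial \xi^1}{\partial x^k} \frac{\partial \xi^1}{\partial x^l} + \frac{\partial \xi^2}{\partial x^k} \frac{\partial \xi^2}{\partial x^l} + \frac{\partial \xi^3}{\partial x^k} \frac{\partial \xi^3}{\partial x^l}. \tag{1.109} $$

For example for $k = 1$, $l = 1$ we get

$$ g_{11} = \frac{\partial \xi^i}{\partial x^1} \frac{\partial \xi^i}{\partial x^1} = \frac{\partial \xi^1}{\partial x^1} \frac{\partial \xi^1}{\partial x^1} + \frac{\partial \xi^2}{\partial x^1} \frac{\partial \xi^2}{\partial x^1} + \frac{\partial \xi^3}{\partial x^1} \frac{\partial \xi^3}{\partial x^1}, \tag{1.110} $$

$$ g_{11} = \left( \frac{1}{2} \right) \left( \frac{1}{2} \right) + (1)(1) + (0)(0) = \frac{5}{4}. \tag{1.111} $$

Repeating this operation for all terms of $g_{kl}$, we find the complete metric tensor is

$$ g_{kl} = \begin{pmatrix} \frac{5}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{1.112} $$

$$ g = \det g_{kl} = (1) \left( \left( \frac{5}{4} \right) (2) - \left( \frac{1}{2} \right) \left( \frac{1}{2} \right) \right) = \frac{9}{4}. \tag{1.113} $$

This is equivalent to the calculation in Gibbs notation:

$$ \mathbf{G} = \mathbf{J}^T \cdot \mathbf{J}, \tag{1.114} $$

$$ \mathbf{G} = \begin{pmatrix} \frac{1}{2} & 1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} \frac{1}{2} & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{1.115} $$

$$ \mathbf{G} = \begin{pmatrix} \frac{5}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.116} $$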

Distance in the transformed system is given by

$$ (ds)^2 = dx^k g_{kl} dx^l, \tag{1.117} $$

$$ (ds)^2 = d\mathbf{x}^T \cdot \mathbf{G} \cdot d\mathbf{x}, \tag{1.118} $$

$$ (ds)^2 = \begin{pmatrix} dx^1 & dx^2 & dx^3 \end{pmatrix} \begin{pmatrix} \frac{5}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix}, \tag{1.119} $$

$$ (ds)^2 = \underbrace{\begin{pmatrix} \frac{5}{4} dx^1 + \frac{1}{2} dx^2 & \frac{1}{2} dx^1 + 2 dx^2 & dx^3 \end{pmatrix}}_{= dx_l = dx^k g_{kl}} \underbrace{\begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix}}_{= dx^l} = dx_l dx^l, \tag{1.120} $$

$$ (ds)^2 = \frac{5}{4} (dx^1)^2 + 2 (dx^2)^2 + (dx^3)^2 + dx^1 dx^2. \tag{1.121} $$

Detailed algebraic manipulation employing the so-called method of quadratic forms, to be discussed in
Sec. 8.12, reveals that the previous equation can be rewritten as follows:

$$ (ds)^2 = \frac{9}{20} \left( dx^1 + 2 dx^2 \right)^2 + \frac{1}{5} \left( -2 dx^1 + dx^2 \right)^2 + (dx^3)^2. \tag{1.122} $$

Direct expansion reveals the two forms for $(ds)^2$ to be identical. Note:

• The Jacobian matrix $\mathbf{J}$ is not symmetric.

• The metric tensor $\mathbf{G} = \mathbf{J}^T \cdot \mathbf{J}$ is symmetric.

• The fact that the metric tensor has non-zero off-diagonal elements is a consequence of the
transformation being non-orthogonal.

• We identify here a new representation of the differential distance vector in the transformed space:
$dx_l = dx^k g_{kl}$, whose significance will soon be discussed in Sec. 1.3.2.

• The distance is guaranteed to be positive. This will be true for all affine transformations in ordinary
three-dimensional Euclidean space. In the generalized space-time continuum suggested by the theory
of relativity, the generalized distance may in fact be negative; this generalized distance $ds$ for an
infinitesimal change in space and time is given by $ds^2 = (d\xi^1)^2 + (d\xi^2)^2 + (d\xi^3)^2 - (d\xi^4)^2$, where the
first three coordinates are the ordinary Cartesian space coordinates and the fourth is $(d\xi^4)^2 = (c \, dt)^2$,
where $c$ is the speed of light.

Also we have the volume ratio of differential elements as

$$ d\xi^1 \, d\xi^2 \, d\xi^3 = \sqrt{\frac{9}{4}} \; dx^1 \, dx^2 \, dx^3, \tag{1.123} $$

$$ = \frac{3}{2} \; dx^1 \, dx^2 \, dx^3. \tag{1.124} $$

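Now we use Eq. (1.94) to find the appropriate derivatives of $\phi$. We first note that

$$ (\mathbf{J}^T)^{-1} = \begin{pmatrix} \frac{1}{2} & 1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac{2}{3} & -\frac{2}{3} & 0 \\ \frac{2}{3} & \frac{1}{3} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.125} $$

So

$$ \begin{pmatrix} \dfrac{\partial \phi}{\partial \xi^1} \\ \dfrac{\partial \phi}{\partial \xi^2} \\ \dfrac{\partial \phi}{\partial \xi^3} \end{pmatrix} = \underbrace{\begin{pmatrix} \frac{2}{3} & -\frac{2}{3} & 0 \\ \frac{2}{3} & \frac{1}{3} & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{(\mathbf{J}^T)^{-1}} \begin{pmatrix} \dfrac{\partial \phi}{\partial x^1} \\ \dfrac{\partial \phi}{\partial x^2} \\ \dfrac{\partial \phi}{\partial x^3} \end{pmatrix}. \tag{1.126} $$

Thus, by inspection,

$$ \frac{\partial \phi}{\partial \xi^1} = \frac{2}{3} \frac{\partial \phi}{\partial x^1} - \frac{2}{3} \frac{\partial \phi}{\partial x^2}, \tag{1.127} $$
$$ \frac{\partial \phi}{\partial \xi^2} = \frac{2}{3} \frac{\partial \phi}{\partial x^1} + \frac{1}{3} \frac{\partial \phi}{\partial x^2}. \tag{1.128} $$

So the transformed version of Eq. (1.98) becomes

$$ \left( \frac{2}{3} \frac{\partial \phi}{\partial x^1} - \frac{2}{3} \frac{\partial \phi}{\partial x^2} \right) + \left( \frac{2}{3} \frac{\partial \phi}{\partial x^1} + \frac{1}{3} \frac{\partial \phi}{\partial x^2} \right) = \left( \frac{1}{2} x^1 - x^2 \right)^2 + \left( x^1 + x^2 \right)^2, \tag{1.129} $$

$$ \frac{4}{3} \frac{\partial \phi}{\partial x^1} - \frac{1}{3} \frac{\partial \phi}{\partial x^2} = \frac{5}{4} (x^1)^2 + x^1 x^2 + 2 (x^2)^2. \tag{1.130} $$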

2. Cartesian to cylindrical coordinates.
The transformations are

$$ x^1 = \pm \sqrt{(\xi^1)^2 + (\xi^2)^2}, \tag{1.131} $$
$$ x^2 = \tan^{-1} \left( \frac{\xi^2}{\xi^1} \right), \tag{1.132} $$
$$ x^3 = \xi^3. \tag{1.133} $$

Here we have taken the unusual step of admitting negative $x^1$. This is admissible mathematically, but
does not make sense according to our geometric intuition as it corresponds to a negative radius. Note
further that this system of equations is non-linear, and that the transformation as defined is non-unique.
For such systems, we cannot always find an explicit algebraic expression for the inverse transformation.
In this case, some straightforward algebraic and trigonometric manipulation reveals that we can find
an explicit representation of the inverse transformation, which is

$$ \xi^1 = x^1 \cos x^2, \tag{1.134} $$
$$ \xi^2 = x^1 \sin x^2, \tag{1.135} $$
$$ \xi^3 = x^3. \tag{1.136} $$

Lines of constant $x^1$ and $x^2$ in the $\xi^1, \xi^2$ plane and lines of constant $\xi^1$ and $\xi^2$ in the $x^1, x^2$ plane are
plotted in Fig. 1.4. Notice that the lines of constant $x^1$ are orthogonal to lines of constant $x^2$ in the
Cartesian $\xi^1, \xi^2$ plane; the analog holds for the $x^1, x^2$ plane. For general transformations, this will not
be the case. Also note that a square of area $1/2 \times 1/2$ is marked in the $\xi^1, \xi^2$ plane. Its image in
the $x^1, x^2$ plane is also indicated. The non-uniqueness of the mapping from one plane to the other is
evident.

The appropriate Jacobian matrix for the inverse transformation is

$$ \mathbf{J} = \frac{\partial \xi^i}{\partial x^j} = \frac{\partial (\xi^1, \xi^2, \xi^3)}{\partial (x^1, x^2, x^3)} = \begin{pmatrix} \dfrac{\partial \xi^1}{\partial x^1} & \dfrac{\partial \xi^1}{\partial x^2} & \dfrac{\partial \xi^1}{\partial x^3} \\ \dfrac{\partial \xi^2}{\partial x^1} & \dfrac{\partial \xi^2}{\partial x^2} & \dfrac{\partial \xi^2}{\partial x^3} \\ \dfrac{\partial \xi^3}{\partial x^1} & \dfrac{\partial \xi^3}{\partial x^2} & \dfrac{\partial \xi^3}{\partial x^3} \end{pmatrix}, \tag{1.137} $$

CC BY-NC-ND. 29 July 2012, Sen & Powers.






Figure 1.4: Lines of constant $x^1$ and $x^2$ in the $\xi^1, \xi^2$ plane and lines of constant $\xi^1$ and $\xi^2$ in
the $x^1, x^2$ plane for cylindrical coordinates transformation of example problem.

$$ \mathbf{J} = \begin{pmatrix} \cos x^2 & -x^1 \sin x^2 & 0 \\ \sin x^2 & x^1 \cos x^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.138} $$

The Jacobian determinant is

$$ J = x^1 \cos^2 x^2 + x^1 \sin^2 x^2 = x^1. \tag{1.139} $$

So a unique transformation fails to exist when $x^1 = 0$. For $x^1 > 0$, the transformation is
orientation-preserving. For $x^1 = 1$, the transformation is volume-preserving. For $x^1 < 0$, the transformation is
orientation-reversing. This is a fundamental mathematical reason why we do not consider negative
radius. It fails to preserve the orientation of a mapped element. For $x^1 \in (0, 1)$, a unit element in $\xi$
space is smaller than a unit element in $x$ space; the converse holds for $x^1 \in (1, \infty)$.
The metric tensor is 



9kl 



dx k dx 1 
For example for k = 1, 1 = 1 we get 



dx k dx 1 



9u 
9u 



da; 1 dx 1 dx 1 dx 1 dx 1 dx 1 
cos 2 x 2 + sin 2 a; 2 + = 1. 



Repeating this operation, we find the complete metric tensor is 

1 



9ki 




(x 1 ) 2 
1 

detg fci = (a: 1 )' 



d£ 3 Of 



dx k dx 1 dx k dx 1 



d£ 3 df 

dx 1 dx 1 



(1.140) 

(1.141) 
(1.142) 

(1.143) 
(1.144) 
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This is equivalent to the calculation in Gibbs notation: 

G = J T J, (1.145) 

/ cos a; 2 sin a; 2 \ I cosx 2 — a; 1 sin a; 2 0\ 

G = — a^sinx 2 x 1 cosa; 2 I - I sina; 2 a^cosa; 2 , (1.146) 

\ 1/ \ 1/ 

(I 0\ 

G = (a; 1 ) 2 . (1.147) 

\0 1/ 

Distance in the transformed system is given by 

(ds) 2 = dx k g u dx 1 , (1.148) 

(dsf = dx T -G-dx, (1.149) 

{dsf = {dx 1 dx 2 da; 3 ) I (x l Y I I dx 1 I , (1.150) 




lix x \ 

(dsf = (dx 1 (x^dx 2 dx 3 ) \ dx 2 \=dxidx l , (1.151) 

v 7"^ \<W 

dxi=dx k g k i s N / ^ 

— rfa; z 

(ds) 2 = (da; 1 ) 2 + (a; 1 dT 2 ) 2 + (da; 3 ) 2 . (1.152) 

Note: 

• The fact that the metric tensor is diagonal can be attributed to the transformation being orthogonal. 

• Since the product of any matrix with its transpose is guaranteed to yield a symmetric matrix, the 
metric tensor is always symmetric. 
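The calculation of Eqs. (1.145)-(1.147) can be checked numerically. The following is a minimal sketch, assuming the cylindrical transformation of this example; the function names `jacobian`, `metric`, and `det3` are illustrative and not part of the original notes.

```python
import math

def jacobian(x1, x2):
    """Jacobian d(xi^i)/d(x^j) of xi = (x1 cos x2, x1 sin x2, x3)."""
    c, s = math.cos(x2), math.sin(x2)
    return [[c, -x1 * s, 0.0],
            [s,  x1 * c, 0.0],
            [0.0, 0.0, 1.0]]

def metric(J):
    """Metric tensor g_kl = sum_i J[i][k] * J[i][l], that is G = J^T . J."""
    n = len(J)
    return [[sum(J[i][k] * J[i][l] for i in range(n)) for l in range(n)]
            for k in range(n)]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

if __name__ == "__main__":
    J = jacobian(2.0, 0.7)      # arbitrary sample point x^1 = 2, x^2 = 0.7
    G = metric(J)
    print("det J =", det3(J))   # should be close to x^1
    print("G =", G)             # should be close to diag(1, (x^1)^2, 1)
```

At any sample point the off-diagonal entries of `G` should vanish to rounding error, and `det3(G)` should equal the square of `det3(J)`.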

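Also we have the volume ratio of differential elements as

$$ d\xi^1 \, d\xi^2 \, d\xi^3 = x^1 \, dx^1 \, dx^2 \, dx^3. \tag{1.153} $$

Now we use Eq. (1.94) to find the appropriate derivatives of $\phi$. We first note that

$$ (\mathbf{J}^T)^{-1} = \begin{pmatrix} \cos x^2 & \sin x^2 & 0 \\ -x^1 \sin x^2 & x^1 \cos x^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \cos x^2 & -\frac{\sin x^2}{x^1} & 0 \\ \sin x^2 & \frac{\cos x^2}{x^1} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.154} $$

So

$$ \begin{pmatrix} \frac{\partial \phi}{\partial \xi^1} \\ \frac{\partial \phi}{\partial \xi^2} \\ \frac{\partial \phi}{\partial \xi^3} \end{pmatrix} = \begin{pmatrix} \cos x^2 & -\frac{\sin x^2}{x^1} & 0 \\ \sin x^2 & \frac{\cos x^2}{x^1} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{\partial \phi}{\partial x^1} \\ \frac{\partial \phi}{\partial x^2} \\ \frac{\partial \phi}{\partial x^3} \end{pmatrix} = \underbrace{\begin{pmatrix} \frac{\partial x^1}{\partial \xi^1} & \frac{\partial x^2}{\partial \xi^1} & \frac{\partial x^3}{\partial \xi^1} \\ \frac{\partial x^1}{\partial \xi^2} & \frac{\partial x^2}{\partial \xi^2} & \frac{\partial x^3}{\partial \xi^2} \\ \frac{\partial x^1}{\partial \xi^3} & \frac{\partial x^2}{\partial \xi^3} & \frac{\partial x^3}{\partial \xi^3} \end{pmatrix}}_{(\mathbf{J}^T)^{-1}} \begin{pmatrix} \frac{\partial \phi}{\partial x^1} \\ \frac{\partial \phi}{\partial x^2} \\ \frac{\partial \phi}{\partial x^3} \end{pmatrix}. \tag{1.155} $$

Thus, by inspection,

$$ \frac{\partial \phi}{\partial \xi^1} = \cos x^2 \frac{\partial \phi}{\partial x^1} - \frac{\sin x^2}{x^1} \frac{\partial \phi}{\partial x^2}, \tag{1.156} $$

$$ \frac{\partial \phi}{\partial \xi^2} = \sin x^2 \frac{\partial \phi}{\partial x^1} + \frac{\cos x^2}{x^1} \frac{\partial \phi}{\partial x^2}. \tag{1.157} $$

So the transformed version of Eq. (1.98) becomes

$$ \left( \cos x^2 \frac{\partial \phi}{\partial x^1} - \frac{\sin x^2}{x^1} \frac{\partial \phi}{\partial x^2} \right) + \left( \sin x^2 \frac{\partial \phi}{\partial x^1} + \frac{\cos x^2}{x^1} \frac{\partial \phi}{\partial x^2} \right) = (x^1)^2, \tag{1.158} $$

$$ \left( \cos x^2 + \sin x^2 \right) \frac{\partial \phi}{\partial x^1} + \left( \frac{\cos x^2 - \sin x^2}{x^1} \right) \frac{\partial \phi}{\partial x^2} = (x^1)^2. \tag{1.159} $$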
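Equations (1.156) and (1.157) can be spot-checked with finite differences. The sketch below assumes an arbitrary test function $\phi = (\xi^1)^2 \xi^2$, chosen only for illustration; the function names and the step size `h` are likewise assumptions.

```python
import math

def phi_cartesian(xi1, xi2):
    """Illustrative test function phi = (xi^1)^2 * xi^2."""
    return xi1 * xi1 * xi2

def phi_polar(x1, x2):
    """Same function expressed through xi^1 = x1 cos x2, xi^2 = x1 sin x2."""
    return phi_cartesian(x1 * math.cos(x2), x1 * math.sin(x2))

def cartesian_gradient_from_polar(x1, x2, h=1e-6):
    """Apply Eqs. (1.156)-(1.157) with central-difference polar derivatives."""
    dphi_dx1 = (phi_polar(x1 + h, x2) - phi_polar(x1 - h, x2)) / (2 * h)
    dphi_dx2 = (phi_polar(x1, x2 + h) - phi_polar(x1, x2 - h)) / (2 * h)
    c, s = math.cos(x2), math.sin(x2)
    dphi_dxi1 = c * dphi_dx1 - (s / x1) * dphi_dx2
    dphi_dxi2 = s * dphi_dx1 + (c / x1) * dphi_dx2
    return dphi_dxi1, dphi_dxi2

if __name__ == "__main__":
    x1, x2 = 1.7, 0.6
    xi1, xi2 = x1 * math.cos(x2), x1 * math.sin(x2)
    print(cartesian_gradient_from_polar(x1, x2))
    print((2 * xi1 * xi2, xi1 * xi1))  # exact Cartesian gradient for comparison
```

The two printed pairs should agree to within the finite-difference error.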



1.3.2 Covariance and contravariance 

Quantities known as contravariant vectors transform locally according to

$$ \bar{u}^i = \frac{\partial \bar{x}^i}{\partial x^j} u^j. \tag{1.160} $$

We note that "local" refers to the fact that the transformation is locally linear. Eq. (1.160) is not a general recipe for a global transformation rule. Quantities known as covariant vectors transform locally according to

$$ \bar{u}_i = \frac{\partial x^j}{\partial \bar{x}^i} u_j. \tag{1.161} $$

Here we have considered general transformations from one non-Cartesian coordinate system $(x^1, x^2, x^3)$ to another $(\bar{x}^1, \bar{x}^2, \bar{x}^3)$. Note that indices associated with contravariant quantities appear as superscripts, and those associated with covariant quantities appear as subscripts. In the special case where the barred coordinate system is Cartesian, we take $U$ to denote the Cartesian vector and say

$$ U^i = \frac{\partial \xi^i}{\partial x^j} u^j, \qquad U_i = \frac{\partial x^j}{\partial \xi^i} u_j. \tag{1.162} $$

Example 1.5 

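Let's say $(x, y, z)$ is a normal Cartesian system and define the transformation

$$ \bar{x} = \lambda x, \qquad \bar{y} = \lambda y, \qquad \bar{z} = \lambda z. \tag{1.163} $$

Now we can assign velocities in both the unbarred and barred systems:

$$ u^x = \frac{dx}{dt}, \qquad u^y = \frac{dy}{dt}, \qquad u^z = \frac{dz}{dt}, \tag{1.164} $$

$$ \bar{u}^x = \frac{d\bar{x}}{dt}, \qquad \bar{u}^y = \frac{d\bar{y}}{dt}, \qquad \bar{u}^z = \frac{d\bar{z}}{dt}, \tag{1.165} $$

$$ \bar{u}^x = \frac{\partial \bar{x}}{\partial x} \frac{dx}{dt}, \qquad \bar{u}^y = \frac{\partial \bar{y}}{\partial y} \frac{dy}{dt}, \qquad \bar{u}^z = \frac{\partial \bar{z}}{\partial z} \frac{dz}{dt}, \tag{1.166} $$

$$ \bar{u}^x = \lambda u^x, \qquad \bar{u}^y = \lambda u^y, \qquad \bar{u}^z = \lambda u^z, \tag{1.167} $$

$$ \bar{u}^x = \frac{\partial \bar{x}}{\partial x} u^x, \qquad \bar{u}^y = \frac{\partial \bar{y}}{\partial y} u^y, \qquad \bar{u}^z = \frac{\partial \bar{z}}{\partial z} u^z. \tag{1.168} $$

This suggests the velocity vector is contravariant.

Now consider a vector which is the gradient of a function $f(x, y, z)$. For example, let

$$ f(x, y, z) = x + y^2 + z^3, \tag{1.169} $$

$$ u_x = \frac{\partial f}{\partial x}, \qquad u_y = \frac{\partial f}{\partial y}, \qquad u_z = \frac{\partial f}{\partial z}, \tag{1.170} $$

$$ u_x = 1, \qquad u_y = 2y, \qquad u_z = 3z^2. \tag{1.171} $$

In the new coordinates

$$ f\left( \frac{\bar{x}}{\lambda}, \frac{\bar{y}}{\lambda}, \frac{\bar{z}}{\lambda} \right) = \frac{\bar{x}}{\lambda} + \frac{\bar{y}^2}{\lambda^2} + \frac{\bar{z}^3}{\lambda^3}, \tag{1.172} $$

so

$$ \bar{f}(\bar{x}, \bar{y}, \bar{z}) = \frac{\bar{x}}{\lambda} + \frac{\bar{y}^2}{\lambda^2} + \frac{\bar{z}^3}{\lambda^3}. \tag{1.173} $$

Now

$$ \bar{u}_x = \frac{\partial \bar{f}}{\partial \bar{x}}, \qquad \bar{u}_y = \frac{\partial \bar{f}}{\partial \bar{y}}, \qquad \bar{u}_z = \frac{\partial \bar{f}}{\partial \bar{z}}, \tag{1.174} $$

$$ \bar{u}_x = \frac{1}{\lambda}, \qquad \bar{u}_y = \frac{2\bar{y}}{\lambda^2}, \qquad \bar{u}_z = \frac{3\bar{z}^2}{\lambda^3}. \tag{1.175} $$

In terms of $x, y, z$, we have

$$ \bar{u}_x = \frac{1}{\lambda}, \qquad \bar{u}_y = \frac{2y}{\lambda}, \qquad \bar{u}_z = \frac{3z^2}{\lambda}. \tag{1.176} $$

So it is clear here that, in contrast to the velocity vector,

$$ \bar{u}_x = \frac{1}{\lambda} u_x, \qquad \bar{u}_y = \frac{1}{\lambda} u_y, \qquad \bar{u}_z = \frac{1}{\lambda} u_z. \tag{1.177} $$

More generally, we find for this case that

$$ \bar{u}_x = \frac{\partial x}{\partial \bar{x}} u_x, \qquad \bar{u}_y = \frac{\partial y}{\partial \bar{y}} u_y, \qquad \bar{u}_z = \frac{\partial z}{\partial \bar{z}} u_z, \tag{1.178} $$

which suggests the gradient vector is covariant.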

Contravariant tensors transform locally according to

$$ \bar{v}^{ij} = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial \bar{x}^j}{\partial x^l} v^{kl}. \tag{1.179} $$

Covariant tensors transform locally according to

$$ \bar{v}_{ij} = \frac{\partial x^k}{\partial \bar{x}^i} \frac{\partial x^l}{\partial \bar{x}^j} v_{kl}. \tag{1.180} $$

Mixed tensors transform locally according to

$$ \bar{v}^i_j = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial x^l}{\partial \bar{x}^j} v^k_l. \tag{1.181} $$

Figure 1.5: Contours for the transformation $x^1 = \xi^1 + (\xi^2)^2$, $x^2 = \xi^2 + (\xi^1)^3$ (left) and a blown-up version (right) including a pair of contravariant basis vectors, which are tangent to the contours, and covariant basis vectors, which are normal to the contours.

Recall that variance is another term for gradient and that co- denotes with. A vector which is co-variant is aligned with the variance or the gradient. Recalling next that contra- denotes against, a vector which is contra-variant is aligned against the variance or the gradient. This results in a set of contravariant basis vectors being tangent to lines of $x^i = C$, while covariant basis vectors are normal to lines of $x^i = C$. A vector in space has two natural representations, one on a contravariant basis, and the other on a covariant basis. The contravariant representation seems more natural because it is similar to the familiar $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ for Cartesian systems, though both can be used to obtain equivalent results.



For the transformation $x^1 = \xi^1 + (\xi^2)^2$, $x^2 = \xi^2 + (\xi^1)^3$, Figure 1.5 gives a plot of a set of lines of constant $x^1$ and $x^2$ in the Cartesian $\xi^1, \xi^2$ plane, along with a local set of contravariant and covariant basis vectors. Note the covariant basis vectors, because they are directly related to the gradient vector, point in the direction of most rapid change of $x^1$ and $x^2$ and are orthogonal to contours on which $x^1$ and $x^2$ are constant. The contravariant vectors are tangent to the contours. It can be shown that the contravariant vectors are aligned with the columns of $\mathbf{J}$, and the covariant vectors are aligned with the rows of $\mathbf{J}^{-1}$. This transformation has some special properties. Near the origin, the higher order terms become negligible, and the transformation reduces to the identity mapping $x^1 \sim \xi^1$, $x^2 \sim \xi^2$. As such, in the neighborhood of the origin, one has $\mathbf{J} = \mathbf{I}$, and there is no change in area or orientation of an element. Moreover, on each of the coordinate axes $x^1 = \xi^1$ and $x^2 = \xi^2$; additionally, on each of the coordinate axes $J = 1$, so in those special locations the transformation is area- and orientation-preserving. This non-linear transformation can be shown to be singular where $J = 0$; this occurs when $\xi^2 = 1/(6(\xi^1)^2)$. As $J \to 0$, the contours of $x^1$ align more and more with the contours of $x^2$, and thus the contravariant basis vectors come closer to paralleling each other. When $J = 0$, the two contours of each osculate. At such points there is only one linearly independent contravariant basis vector, which is not enough to represent an arbitrary vector in a linear combination. An analog holds for the covariant basis vectors. In the first and fourth quadrants and some of the second and third, the transformation is orientation-reversing. The transformation is orientation-preserving in most of the second and third quadrants.



I 

Example 1.6 

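Consider the vector fields defined in Cartesian coordinates by

$$ \text{a)} \quad U^i = \begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix}, \qquad \text{b)} \quad U^i = \begin{pmatrix} \xi^1 \\ 2\xi^2 \end{pmatrix}. \tag{1.182} $$

At the point

$$ P: \begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \tag{1.183} $$

find the covariant and contravariant representations of both cases of $U^i$ in cylindrical coordinates.

a) At $P$ in the Cartesian system, we have the contravariant

$$ U^i = \left. \begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix} \right|_{\xi^1 = 1, \, \xi^2 = 1} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{1.184} $$

For a Cartesian coordinate system, the metric tensor $g_{ij} = \delta_{ij} = g_{ji} = \delta_{ji}$. Thus, the covariant representation in the Cartesian system is

$$ U_j = g_{ji} U^i = \delta_{ji} U^i = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{1.185} $$

Now consider cylindrical coordinates: $\xi^1 = x^1 \cos x^2$, $\xi^2 = x^1 \sin x^2$. For the inverse transformation, let us insist that $J > 0$, so $x^1 = \sqrt{(\xi^1)^2 + (\xi^2)^2}$, $x^2 = \tan^{-1}(\xi^2 / \xi^1)$. Thus, at $P$ we have a representation of

$$ P: \begin{pmatrix} x^1 \\ x^2 \end{pmatrix} = \begin{pmatrix} \sqrt{2} \\ \frac{\pi}{4} \end{pmatrix}. \tag{1.186} $$

For the transformation, we have

$$ \mathbf{J} = \begin{pmatrix} \cos x^2 & -x^1 \sin x^2 \\ \sin x^2 & x^1 \cos x^2 \end{pmatrix}, \qquad \mathbf{G} = \mathbf{J}^T \cdot \mathbf{J} = \begin{pmatrix} 1 & 0 \\ 0 & (x^1)^2 \end{pmatrix}. \tag{1.187} $$

At $P$, we thus have

$$ \mathbf{J} = \begin{pmatrix} \frac{\sqrt{2}}{2} & -1 \\ \frac{\sqrt{2}}{2} & 1 \end{pmatrix}, \qquad \mathbf{G} = \mathbf{J}^T \cdot \mathbf{J} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}. \tag{1.188} $$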



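Now, specializing Eq. (1.160) by considering the barred coordinate to be Cartesian, we can say

$$ U^i = \frac{\partial \xi^i}{\partial x^j} u^j. \tag{1.189} $$

Locally, we can use the Gibbs notation and say $\mathbf{U} = \mathbf{J} \cdot \mathbf{u}$, and thus get $\mathbf{u} = \mathbf{J}^{-1} \cdot \mathbf{U}$, so that the contravariant representation is

$$ \begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{2}}{2} & -1 \\ \frac{\sqrt{2}}{2} & 1 \end{pmatrix}^{-1} \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \sqrt{2} \\ 0 \end{pmatrix}. \tag{1.190} $$

In Gibbs notation, one can interpret this as $1\mathbf{i} + 1\mathbf{j} = \sqrt{2}\,\mathbf{e}_r + 0\,\mathbf{e}_\theta$. Note that this representation is different than the simple polar coordinates of $P$ given by Eq. (1.186). Let us look closer at the cylindrical basis vectors $\mathbf{e}_r$ and $\mathbf{e}_\theta$. In cylindrical coordinates, the contravariant representations of the unit basis vectors must be $\mathbf{e}_r = (1, 0)^T$ and $\mathbf{e}_\theta = (0, 1)^T$. So in Cartesian coordinates those basis vectors are represented as

$$ \mathbf{e}_r = \mathbf{J} \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos x^2 & -x^1 \sin x^2 \\ \sin x^2 & x^1 \cos x^2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos x^2 \\ \sin x^2 \end{pmatrix}, \tag{1.191} $$

$$ \mathbf{e}_\theta = \mathbf{J} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \cos x^2 & -x^1 \sin x^2 \\ \sin x^2 & x^1 \cos x^2 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -x^1 \sin x^2 \\ x^1 \cos x^2 \end{pmatrix}. \tag{1.192} $$

In general a unit vector in the transformed space is not a unit vector in the Cartesian space. Note that $\mathbf{e}_\theta$ is a unit vector in Cartesian space only when $x^1 = 1$; this is also the condition for $J = 1$. Lastly, we see the covariant representation is given by $u_j = u^i g_{ij}$. Since $g_{ij}$ is symmetric, we can transpose this to get $u_j = g_{ji} u^i$:

$$ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \mathbf{G} \cdot \begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \cdot \begin{pmatrix} \sqrt{2} \\ 0 \end{pmatrix} = \begin{pmatrix} \sqrt{2} \\ 0 \end{pmatrix}. \tag{1.193} $$

This simple vector field has an identical contravariant and covariant representation. The appropriate invariant quantities are independent of the representation:

$$ U_i U^i = \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 2, \tag{1.194} $$

$$ u_i u^i = \begin{pmatrix} \sqrt{2} & 0 \end{pmatrix} \begin{pmatrix} \sqrt{2} \\ 0 \end{pmatrix} = 2. \tag{1.195} $$

Though tempting, we note that there is no clear way to form the representation $x_i x^i$ to demonstrate any additional invariance.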

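b) At $P$ in the Cartesian system, we have the contravariant

$$ U^i = \left. \begin{pmatrix} \xi^1 \\ 2\xi^2 \end{pmatrix} \right|_{\xi^1 = 1, \, \xi^2 = 1} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \tag{1.196} $$

In the same fashion as demonstrated in part a), we find the contravariant representation of $U^i$ in cylindrical coordinates at $P$ is

$$ \begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{2}}{2} & -1 \\ \frac{\sqrt{2}}{2} & 1 \end{pmatrix}^{-1} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} \frac{3}{\sqrt{2}} \\ \frac{1}{2} \end{pmatrix}. \tag{1.197} $$

In Gibbs notation, we could interpret this as $1\mathbf{i} + 2\mathbf{j} = (3/\sqrt{2})\,\mathbf{e}_r + (1/2)\,\mathbf{e}_\theta$. The covariant representation is given once again by $u_j = g_{ji} u^i$:

$$ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \mathbf{G} \cdot \begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \cdot \begin{pmatrix} \frac{3}{\sqrt{2}} \\ \frac{1}{2} \end{pmatrix} = \begin{pmatrix} \frac{3}{\sqrt{2}} \\ 1 \end{pmatrix}. \tag{1.198} $$

This less simple vector field has distinct contravariant and covariant representations. However, the appropriate invariant quantities are independent of the representation:

$$ U_i U^i = \begin{pmatrix} 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 5, \tag{1.199} $$

$$ u_i u^i = \begin{pmatrix} \frac{3}{\sqrt{2}} & 1 \end{pmatrix} \begin{pmatrix} \frac{3}{\sqrt{2}} \\ \frac{1}{2} \end{pmatrix} = 5. \tag{1.200} $$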
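The arithmetic of Example 1.6 can be reproduced in a few lines. This is a minimal sketch restricted to the two-dimensional polar part of the transformation; the helper names (`mat_vec`, `inverse_2x2`, `polar_representations`, `inner`) are hypothetical.

```python
import math

def mat_vec(A, v):
    """Matrix-vector product for small dense lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def polar_representations(U, xi1, xi2):
    """Contravariant u = J^-1 . U and covariant u_j = g_ji u^i at (xi1, xi2)."""
    x1 = math.hypot(xi1, xi2)          # radius, taken positive so that J > 0
    x2 = math.atan2(xi2, xi1)          # angle
    c, s = math.cos(x2), math.sin(x2)
    J = [[c, -x1 * s], [s, x1 * c]]
    G = [[1.0, 0.0], [0.0, x1 * x1]]   # metric tensor G = J^T . J
    u_contra = mat_vec(inverse_2x2(J), U)
    u_co = mat_vec(G, u_contra)
    return u_contra, u_co

def inner(a, b):
    """Contraction a_i b^i."""
    return sum(x * y for x, y in zip(a, b))

if __name__ == "__main__":
    for U in ([1.0, 1.0], [1.0, 2.0]):  # cases a) and b) at P = (1, 1)
        u_contra, u_co = polar_representations(U, 1.0, 1.0)
        print(U, u_contra, u_co, inner(U, U), inner(u_co, u_contra))
```

For each case the last two printed numbers, the invariant computed in Cartesian and in polar form, should coincide.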



The idea of covariant and contravariant derivatives plays an important role in mathematical physics, namely in that the equations should be formulated such that they are invariant under coordinate transformations. This is not particularly difficult for Cartesian systems, but for non-orthogonal systems, one cannot use differentiation in the ordinary sense but must instead use the notion of covariant and contravariant derivatives, depending on the problem. The role of these terms was especially important in the development of the theory of relativity.

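Consider a contravariant vector $u^i$ defined in $x^i$ which has corresponding components $U^i$ in the Cartesian $\xi^i$. Take $w^i_j$ and $W^i_j$ to represent the covariant spatial derivative of $u^i$ and $U^i$, respectively. Let's use the chain rule and definitions of tensorial quantities to arrive at a formula for covariant differentiation. From the definition of contravariance, Eq. (1.160),

$$ U^i = \frac{\partial \xi^i}{\partial x^l} u^l. \tag{1.201} $$

Take the derivative in Cartesian space and then use the chain rule:

$$ W^i_j = \frac{\partial U^i}{\partial \xi^j} = \frac{\partial U^i}{\partial x^k} \frac{\partial x^k}{\partial \xi^j}, \tag{1.202} $$

$$ = \left( \frac{\partial}{\partial x^k} \left( \frac{\partial \xi^i}{\partial x^l} u^l \right) \right) \frac{\partial x^k}{\partial \xi^j}, \tag{1.203} $$

$$ = \left( \frac{\partial^2 \xi^i}{\partial x^k \partial x^l} u^l + \frac{\partial \xi^i}{\partial x^l} \frac{\partial u^l}{\partial x^k} \right) \frac{\partial x^k}{\partial \xi^j}, \tag{1.204} $$

$$ W^p_q = \left( \frac{\partial^2 \xi^p}{\partial x^k \partial x^l} u^l + \frac{\partial \xi^p}{\partial x^l} \frac{\partial u^l}{\partial x^k} \right) \frac{\partial x^k}{\partial \xi^q}. \tag{1.205} $$

From the definition of a mixed tensor, Eq. (1.181),

$$ w^i_j = W^p_q \frac{\partial x^i}{\partial \xi^p} \frac{\partial \xi^q}{\partial x^j}, \tag{1.206} $$

$$ = \underbrace{\left( \frac{\partial^2 \xi^p}{\partial x^k \partial x^l} u^l + \frac{\partial \xi^p}{\partial x^l} \frac{\partial u^l}{\partial x^k} \right) \frac{\partial x^k}{\partial \xi^q}}_{= W^p_q} \frac{\partial x^i}{\partial \xi^p} \frac{\partial \xi^q}{\partial x^j}, \tag{1.207} $$

$$ = \frac{\partial^2 \xi^p}{\partial x^k \partial x^l} \frac{\partial x^k}{\partial \xi^q} \frac{\partial x^i}{\partial \xi^p} \frac{\partial \xi^q}{\partial x^j} u^l + \frac{\partial \xi^p}{\partial x^l} \frac{\partial x^k}{\partial \xi^q} \frac{\partial x^i}{\partial \xi^p} \frac{\partial \xi^q}{\partial x^j} \frac{\partial u^l}{\partial x^k}, \tag{1.208} $$

$$ = \frac{\partial^2 \xi^p}{\partial x^k \partial x^l} \frac{\partial x^k}{\partial x^j} \frac{\partial x^i}{\partial \xi^p} u^l + \frac{\partial x^i}{\partial x^l} \frac{\partial x^k}{\partial x^j} \frac{\partial u^l}{\partial x^k}, \tag{1.209} $$

$$ = \frac{\partial^2 \xi^p}{\partial x^k \partial x^l} \delta^k_j \frac{\partial x^i}{\partial \xi^p} u^l + \delta^i_l \delta^k_j \frac{\partial u^l}{\partial x^k}, \tag{1.210} $$

$$ = \frac{\partial^2 \xi^p}{\partial x^j \partial x^l} \frac{\partial x^i}{\partial \xi^p} u^l + \frac{\partial u^i}{\partial x^j}. \tag{1.211} $$

Here, we have used the identity that

$$ \frac{\partial x^i}{\partial x^j} = \delta^i_j, \tag{1.212} $$

where $\delta^i_j$ is another form of the Kronecker delta. We define the Christoffel¹² symbols $\Gamma^i_{jl}$ as follows:

$$ \Gamma^i_{jl} = \frac{\partial^2 \xi^p}{\partial x^j \partial x^l} \frac{\partial x^i}{\partial \xi^p}, \tag{1.213} $$

and use the term $\Delta_j$ to represent the covariant derivative. Thus, the covariant derivative of a contravariant vector $u^i$ is as follows:

$$ \Delta_j u^i = w^i_j = \frac{\partial u^i}{\partial x^j} + \Gamma^i_{jl} u^l. \tag{1.214} $$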
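The definition of the Christoffel symbols in Eq. (1.213) can be evaluated directly for plane polar coordinates. The sketch below hard-codes the analytic second derivatives of $\xi^1 = x^1 \cos x^2$, $\xi^2 = x^1 \sin x^2$ and the inverse Jacobian; the function name and the zero-based index layout `Gamma[i][j][l]` are assumptions made for illustration.

```python
import math

def christoffel_polar(x1, x2):
    """Gamma[i][j][l] = sum_p d2(xi^p)/(dx^j dx^l) * d(x^i)/d(xi^p), per Eq. (1.213).

    Indices are zero-based: index 0 is x^1 (radius), index 1 is x^2 (angle).
    """
    c, s = math.cos(x2), math.sin(x2)
    # second[p][j][l] = d^2 xi^p / (dx^j dx^l)
    second = [
        [[0.0, -s], [-s, -x1 * c]],   # xi^1 = x1 cos x2
        [[0.0,  c], [ c, -x1 * s]],   # xi^2 = x1 sin x2
    ]
    # inv[i][p] = d x^i / d xi^p, the inverse of the Jacobian
    inv = [[c, s], [-s / x1, c / x1]]
    return [[[sum(second[p][j][l] * inv[i][p] for p in range(2))
              for l in range(2)]
             for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    Gamma = christoffel_polar(2.0, 0.3)
    print("Gamma^1_22 =", Gamma[0][1][1])   # should be close to -x^1
    print("Gamma^2_12 =", Gamma[1][0][1])   # should be close to 1/x^1
```

Only $\Gamma^1_{22}$ and $\Gamma^2_{12} = \Gamma^2_{21}$ should come out non-zero, which is the standard result for polar coordinates.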

Example 1.7 

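Find $\nabla^T \cdot \mathbf{u}$ in cylindrical coordinates. The transformations are

$$ x^1 = +\sqrt{(\xi^1)^2 + (\xi^2)^2}, \tag{1.215} $$

$$ x^2 = \tan^{-1}\left( \frac{\xi^2}{\xi^1} \right), \tag{1.216} $$

$$ x^3 = \xi^3. \tag{1.217} $$

The inverse transformation is

$$ \xi^1 = x^1 \cos x^2, \tag{1.218} $$

$$ \xi^2 = x^1 \sin x^2, \tag{1.219} $$

$$ \xi^3 = x^3. \tag{1.220} $$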

12 Elwin Bruno Christoffel, 1829-1900, German mathematician.

CC BY-NC-ND. 29 July 2012, Sen & Powers.






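This corresponds to finding

$$ \Delta_i u^i = w^i_i = \frac{\partial u^i}{\partial x^i} + \Gamma^i_{il} u^l. \tag{1.221} $$

Now for $i = j$

$$ \Gamma^i_{il} u^l = \frac{\partial^2 \xi^p}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^p} u^l, \tag{1.222} $$

$$ = \frac{\partial^2 \xi^1}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^1} u^l + \frac{\partial^2 \xi^2}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^2} u^l + \underbrace{\frac{\partial^2 \xi^3}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^3} u^l}_{=0}. \tag{1.223} $$

Noting that all second partials of $\xi^3$ are zero,

$$ \Gamma^i_{il} u^l = \frac{\partial^2 \xi^1}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^1} u^l + \frac{\partial^2 \xi^2}{\partial x^i \partial x^l} \frac{\partial x^i}{\partial \xi^2} u^l. \tag{1.224} $$

Expanding the $i$ summation,

$$ \begin{aligned} \Gamma^i_{il} u^l = {} & \frac{\partial^2 \xi^1}{\partial x^1 \partial x^l} \frac{\partial x^1}{\partial \xi^1} u^l + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^l} \frac{\partial x^2}{\partial \xi^1} u^l + \underbrace{\frac{\partial^2 \xi^1}{\partial x^3 \partial x^l} \frac{\partial x^3}{\partial \xi^1} u^l}_{=0} \\ & + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^l} \frac{\partial x^1}{\partial \xi^2} u^l + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^l} \frac{\partial x^2}{\partial \xi^2} u^l + \underbrace{\frac{\partial^2 \xi^2}{\partial x^3 \partial x^l} \frac{\partial x^3}{\partial \xi^2} u^l}_{=0}. \end{aligned} \tag{1.225} $$

Noting that partials of $x^3$ with respect to $\xi^1$ and $\xi^2$ are zero,

$$ \Gamma^i_{il} u^l = \frac{\partial^2 \xi^1}{\partial x^1 \partial x^l} \frac{\partial x^1}{\partial \xi^1} u^l + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^l} \frac{\partial x^2}{\partial \xi^1} u^l + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^l} \frac{\partial x^1}{\partial \xi^2} u^l + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^l} \frac{\partial x^2}{\partial \xi^2} u^l. \tag{1.226} $$

Expanding the $l$ summation, we get

$$ \begin{aligned} \Gamma^i_{il} u^l = {} & \frac{\partial^2 \xi^1}{\partial x^1 \partial x^1} \frac{\partial x^1}{\partial \xi^1} u^1 + \frac{\partial^2 \xi^1}{\partial x^1 \partial x^2} \frac{\partial x^1}{\partial \xi^1} u^2 + \underbrace{\frac{\partial^2 \xi^1}{\partial x^1 \partial x^3} \frac{\partial x^1}{\partial \xi^1} u^3}_{=0} \\ & + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^1} \frac{\partial x^2}{\partial \xi^1} u^1 + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^2} \frac{\partial x^2}{\partial \xi^1} u^2 + \underbrace{\frac{\partial^2 \xi^1}{\partial x^2 \partial x^3} \frac{\partial x^2}{\partial \xi^1} u^3}_{=0} \\ & + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^1} \frac{\partial x^1}{\partial \xi^2} u^1 + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^2} \frac{\partial x^1}{\partial \xi^2} u^2 + \underbrace{\frac{\partial^2 \xi^2}{\partial x^1 \partial x^3} \frac{\partial x^1}{\partial \xi^2} u^3}_{=0} \\ & + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^1} \frac{\partial x^2}{\partial \xi^2} u^1 + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^2} \frac{\partial x^2}{\partial \xi^2} u^2 + \underbrace{\frac{\partial^2 \xi^2}{\partial x^2 \partial x^3} \frac{\partial x^2}{\partial \xi^2} u^3}_{=0}. \end{aligned} \tag{1.227} $$

Again removing the $x^3$ variation, we get

$$ \begin{aligned} \Gamma^i_{il} u^l = {} & \frac{\partial^2 \xi^1}{\partial x^1 \partial x^1} \frac{\partial x^1}{\partial \xi^1} u^1 + \frac{\partial^2 \xi^1}{\partial x^1 \partial x^2} \frac{\partial x^1}{\partial \xi^1} u^2 + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^1} \frac{\partial x^2}{\partial \xi^1} u^1 + \frac{\partial^2 \xi^1}{\partial x^2 \partial x^2} \frac{\partial x^2}{\partial \xi^1} u^2 \\ & + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^1} \frac{\partial x^1}{\partial \xi^2} u^1 + \frac{\partial^2 \xi^2}{\partial x^1 \partial x^2} \frac{\partial x^1}{\partial \xi^2} u^2 + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^1} \frac{\partial x^2}{\partial \xi^2} u^1 + \frac{\partial^2 \xi^2}{\partial x^2 \partial x^2} \frac{\partial x^2}{\partial \xi^2} u^2. \end{aligned} \tag{1.228} $$

Substituting for the partial derivatives, we find

$$ \begin{aligned} \Gamma^i_{il} u^l = {} & 0 \, u^1 - \sin x^2 \cos x^2 \, u^2 - \sin x^2 \left( \frac{-\sin x^2}{x^1} \right) u^1 - x^1 \cos x^2 \left( \frac{-\sin x^2}{x^1} \right) u^2 \\ & + 0 \, u^1 + \cos x^2 \sin x^2 \, u^2 + \cos x^2 \left( \frac{\cos x^2}{x^1} \right) u^1 - x^1 \sin x^2 \left( \frac{\cos x^2}{x^1} \right) u^2, \end{aligned} \tag{1.229} $$

$$ = \frac{u^1}{x^1}. \tag{1.230} $$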



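So, in cylindrical coordinates

$$ \nabla^T \cdot \mathbf{u} = \frac{\partial u^1}{\partial x^1} + \frac{\partial u^2}{\partial x^2} + \frac{\partial u^3}{\partial x^3} + \frac{u^1}{x^1}. \tag{1.231} $$

Note: In standard cylindrical notation, $x^1 = r$, $x^2 = \theta$, $x^3 = z$. Considering $\mathbf{u}$ to be a velocity vector, we get

$$ \nabla^T \cdot \mathbf{u} = \frac{\partial}{\partial r} \left( \frac{dr}{dt} \right) + \frac{\partial}{\partial \theta} \left( \frac{d\theta}{dt} \right) + \frac{\partial}{\partial z} \left( \frac{dz}{dt} \right) + \frac{1}{r} \left( \frac{dr}{dt} \right), \tag{1.232} $$

$$ \nabla^T \cdot \mathbf{u} = \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{dr}{dt} \right) + \frac{1}{r} \frac{\partial}{\partial \theta} \left( r \frac{d\theta}{dt} \right) + \frac{\partial}{\partial z} \left( \frac{dz}{dt} \right), \tag{1.233} $$

$$ \nabla^T \cdot \mathbf{u} = \frac{1}{r} \frac{\partial}{\partial r} (r u_r) + \frac{1}{r} \frac{\partial u_\theta}{\partial \theta} + \frac{\partial u_z}{\partial z}. \tag{1.234} $$

Here we have also used the more traditional $u_\theta = r (d\theta/dt) = x^1 u^2$, along with $u_r = u^1$, $u_z = u^3$. For practical purposes, this ensures that $u_r$, $u_\theta$, $u_z$ all have the same dimensions.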
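Equation (1.234) can be compared against a Cartesian divergence for a planar field. The sketch below assumes an illustrative field $\mathbf{U} = ((\xi^1)^2, \xi^1 \xi^2)$, whose Cartesian divergence is $3\xi^1$, and evaluates the cylindrical form with central differences; all names and the step size are assumptions.

```python
import math

def cartesian_field(xi1, xi2):
    """Illustrative planar field U = ((xi^1)^2, xi^1 xi^2); its divergence is 3 xi^1."""
    return (xi1 * xi1, xi1 * xi2)

def physical_components(r, theta):
    """Project U onto the unit vectors e_r and e_theta to get (u_r, u_theta)."""
    c, s = math.cos(theta), math.sin(theta)
    U1, U2 = cartesian_field(r * c, r * s)
    return (U1 * c + U2 * s, -U1 * s + U2 * c)

def divergence_polar(r, theta, h=1e-5):
    """(1/r) d(r u_r)/dr + (1/r) d(u_theta)/d(theta), per Eq. (1.234) without z."""
    d_r = ((r + h) * physical_components(r + h, theta)[0]
           - (r - h) * physical_components(r - h, theta)[0]) / (2 * h)
    d_theta = (physical_components(r, theta + h)[1]
               - physical_components(r, theta - h)[1]) / (2 * h)
    return (d_r + d_theta) / r

if __name__ == "__main__":
    r, theta = 1.3, 0.4
    print(divergence_polar(r, theta), 3 * r * math.cos(theta))
```

The two printed values should match to within the finite-difference error.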



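Example 1.8

Calculate the acceleration vector $d\mathbf{u}/dt$ in cylindrical coordinates.

Start by expanding the total derivative as

$$ \frac{d\mathbf{u}}{dt} = \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}^T \cdot \nabla \mathbf{u}. $$

Now, we take $\mathbf{u}$ to be a contravariant velocity vector and the gradient operation to be a covariant derivative. Employ index notation to get

$$ \frac{d\mathbf{u}}{dt} = \frac{\partial u^i}{\partial t} + u^j \Delta_j u^i, \tag{1.235} $$

$$ = \frac{\partial u^i}{\partial t} + u^j \left( \frac{\partial u^i}{\partial x^j} + \Gamma^i_{jl} u^l \right). \tag{1.236} $$

After an extended calculation similar to the previous example, one finds after expanding all terms that

$$ \frac{d\mathbf{u}}{dt} = \begin{pmatrix} \frac{\partial u^1}{\partial t} \\ \frac{\partial u^2}{\partial t} \\ \frac{\partial u^3}{\partial t} \end{pmatrix} + \begin{pmatrix} u^1 \frac{\partial u^1}{\partial x^1} + u^2 \frac{\partial u^1}{\partial x^2} + u^3 \frac{\partial u^1}{\partial x^3} \\ u^1 \frac{\partial u^2}{\partial x^1} + u^2 \frac{\partial u^2}{\partial x^2} + u^3 \frac{\partial u^2}{\partial x^3} \\ u^1 \frac{\partial u^3}{\partial x^1} + u^2 \frac{\partial u^3}{\partial x^2} + u^3 \frac{\partial u^3}{\partial x^3} \end{pmatrix} + \begin{pmatrix} -x^1 (u^2)^2 \\ \frac{2 u^1 u^2}{x^1} \\ 0 \end{pmatrix}. \tag{1.237} $$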




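The last term is related to the well known Coriolis¹³ and centripetal acceleration terms. However, these are not in the standard form to which most are accustomed. To arrive at that standard form, one must return to a so-called physical representation. Here again take $x^1 = r$, $x^2 = \theta$, and $x^3 = z$. Also take $u_r = dr/dt = u^1$, $u_\theta = r (d\theta/dt) = x^1 u^2$, $u_z = dz/dt = u^3$. Then the $r$ acceleration equation becomes

$$ \frac{du_r}{dt} = \frac{\partial u_r}{\partial t} + u_r \frac{\partial u_r}{\partial r} + \frac{u_\theta}{r} \frac{\partial u_r}{\partial \theta} + u_z \frac{\partial u_r}{\partial z} - \underbrace{\frac{u_\theta^2}{r}}_{\text{centripetal}}. \tag{1.238} $$

Here the final term is the traditional centripetal acceleration. The $\theta$ acceleration is slightly more complicated. First one writes

$$ \frac{d}{dt} \left( \frac{d\theta}{dt} \right) = \frac{\partial}{\partial t} \left( \frac{d\theta}{dt} \right) + \frac{dr}{dt} \frac{\partial}{\partial r} \left( \frac{d\theta}{dt} \right) + \frac{d\theta}{dt} \frac{\partial}{\partial \theta} \left( \frac{d\theta}{dt} \right) + \frac{dz}{dt} \frac{\partial}{\partial z} \left( \frac{d\theta}{dt} \right) + \frac{2}{r} \frac{dr}{dt} \frac{d\theta}{dt}. \tag{1.239} $$

Now, here one is actually interested in $du_\theta/dt$, so both sides are multiplied by $r$ and then one operates to get

$$ \frac{du_\theta}{dt} = r \frac{\partial}{\partial t} \left( \frac{d\theta}{dt} \right) + r \frac{dr}{dt} \frac{\partial}{\partial r} \left( \frac{d\theta}{dt} \right) + r \frac{d\theta}{dt} \frac{\partial}{\partial \theta} \left( \frac{d\theta}{dt} \right) + r \frac{dz}{dt} \frac{\partial}{\partial z} \left( \frac{d\theta}{dt} \right) + 2 \frac{dr}{dt} \frac{d\theta}{dt}, \tag{1.240} $$

$$ = \frac{\partial}{\partial t} \left( r \frac{d\theta}{dt} \right) + \frac{dr}{dt} \left( \frac{\partial}{\partial r} \left( r \frac{d\theta}{dt} \right) - \frac{d\theta}{dt} \right) + \frac{r \frac{d\theta}{dt}}{r} \frac{\partial}{\partial \theta} \left( r \frac{d\theta}{dt} \right) + \frac{dz}{dt} \frac{\partial}{\partial z} \left( r \frac{d\theta}{dt} \right) + 2 \frac{dr}{dt} \frac{\left( r \frac{d\theta}{dt} \right)}{r}, \tag{1.241} $$

$$ = \frac{\partial u_\theta}{\partial t} + u_r \frac{\partial u_\theta}{\partial r} + \frac{u_\theta}{r} \frac{\partial u_\theta}{\partial \theta} + u_z \frac{\partial u_\theta}{\partial z} + \underbrace{\frac{u_r u_\theta}{r}}_{\text{Coriolis}}. \tag{1.242} $$

The final term here is the Coriolis acceleration. The $z$ acceleration then is easily seen to be

$$ \frac{du_z}{dt} = \frac{\partial u_z}{\partial t} + u_r \frac{\partial u_z}{\partial r} + \frac{u_\theta}{r} \frac{\partial u_z}{\partial \theta} + u_z \frac{\partial u_z}{\partial z}. \tag{1.243} $$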



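We summarize some useful identities, all of which can be proved, as well as some other common notation, as follows:

$$ g_{kl} = \frac{\partial \xi^i}{\partial x^k} \frac{\partial \xi^i}{\partial x^l}, \tag{1.244} $$

$$ g = \det g_{ij}, \tag{1.245} $$

$$ g_{ik} g^{kj} = g^j_i = g^i_j = \delta^j_i = \delta^i_j = \delta_{ij} = \delta^{ij}, \tag{1.246} $$

$$ u_j = u^i g_{ij}, \tag{1.247} $$

$$ u^i = g^{ij} u_j, \tag{1.248} $$

$$ \mathbf{u}^T \cdot \mathbf{v} = u_i v^i = u^i v_i = u^i g_{ij} v^j = u_i g^{ij} v_j, \tag{1.249} $$

$$ \mathbf{u} \times \mathbf{v} = \epsilon^{ijk} g_{jm} g_{kn} u^m v^n = \epsilon^{ijk} u_j v_k, \tag{1.250} $$

$$ \Gamma^i_{jk} = \frac{\partial^2 \xi^p}{\partial x^j \partial x^k} \frac{\partial x^i}{\partial \xi^p} = \frac{1}{2} g^{ip} \left( \frac{\partial g_{pj}}{\partial x^k} + \frac{\partial g_{pk}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^p} \right), \tag{1.251} $$

$$ \nabla \mathbf{u} = \Delta_j u^i = u^i_{,j} = \frac{\partial u^i}{\partial x^j} + \Gamma^i_{jl} u^l, \tag{1.252} $$

$$ \text{div } \mathbf{u} = \nabla^T \cdot \mathbf{u} = \Delta_i u^i = u^i_{,i} = \frac{\partial u^i}{\partial x^i} + \Gamma^i_{il} u^l = \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^i} \left( \sqrt{g} \, u^i \right), \tag{1.253} $$

$$ \text{curl } \mathbf{u} = \nabla \times \mathbf{u} = \epsilon^{ijk} u_{k,j} = \epsilon^{ijk} g_{kp} u^p_{,j} = \epsilon^{ijk} g_{kp} \left( \frac{\partial u^p}{\partial x^j} + \Gamma^p_{jl} u^l \right), \tag{1.254} $$

$$ \frac{d\mathbf{u}}{dt} = \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}^T \cdot \nabla \mathbf{u} = \frac{\partial u^i}{\partial t} + u^j \frac{\partial u^i}{\partial x^j} + \Gamma^i_{jl} u^l u^j, \tag{1.255} $$

$$ \text{grad } \phi = \nabla \phi = \phi_{,i} = \frac{\partial \phi}{\partial x^i}, \tag{1.256} $$

$$ \text{div grad } \phi = \nabla^2 \phi = \nabla^T \cdot \nabla \phi = g^{ij} \phi_{,ij} = \frac{\partial}{\partial x^j} \left( g^{ij} \frac{\partial \phi}{\partial x^i} \right) + \Gamma^j_{jk} g^{ik} \frac{\partial \phi}{\partial x^i}, \tag{1.257} $$

$$ = \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^j} \left( \sqrt{g} \, g^{ij} \frac{\partial \phi}{\partial x^i} \right), \tag{1.258} $$

$$ \nabla \mathbf{T} = T^{ij}_{,k} = \frac{\partial T^{ij}}{\partial x^k} + \Gamma^i_{lk} T^{lj} + \Gamma^j_{lk} T^{il}, \tag{1.259} $$

$$ \text{div } \mathbf{T} = \nabla^T \cdot \mathbf{T} = T^{ij}_{,j} = \frac{\partial T^{ij}}{\partial x^j} + \Gamma^i_{lj} T^{lj} + \Gamma^j_{lj} T^{il}, \tag{1.260} $$

$$ = \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^j} \left( \sqrt{g} \, T^{ij} \right) + \Gamma^i_{jk} T^{jk} = \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^j} \left( \sqrt{g} \, T^{kj} \frac{\partial \xi^i}{\partial x^k} \right). \tag{1.261} $$

13 Gaspard-Gustave Coriolis, 1792-1843, French mechanician.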

1.3.3 Orthogonal curvilinear coordinates 

In this section we specialize our discussion to widely used orthogonal curvilinear coordinate 
transformations. Such transformations admit non-constant diagonal metric tensors. Because 
of the diagonal nature of the metric tensor, many simplifications arise. For such systems, 
subscripts alone suffice. Here, we simply summarize the results. 

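For an orthogonal curvilinear coordinate system $(q_1, q_2, q_3)$, we have

$$ ds^2 = (h_1 \, dq_1)^2 + (h_2 \, dq_2)^2 + (h_3 \, dq_3)^2, \tag{1.262} $$

where

$$ h_i = \sqrt{ \left( \frac{\partial x_1}{\partial q_i} \right)^2 + \left( \frac{\partial x_2}{\partial q_i} \right)^2 + \left( \frac{\partial x_3}{\partial q_i} \right)^2 }. \tag{1.263} $$

We can show that

$$ \text{grad } \phi = \nabla \phi = \frac{1}{h_1} \frac{\partial \phi}{\partial q_1} \mathbf{e}_1 + \frac{1}{h_2} \frac{\partial \phi}{\partial q_2} \mathbf{e}_2 + \frac{1}{h_3} \frac{\partial \phi}{\partial q_3} \mathbf{e}_3, \tag{1.264} $$

$$ \text{div } \mathbf{u} = \nabla^T \cdot \mathbf{u} = \frac{1}{h_1 h_2 h_3} \left( \frac{\partial}{\partial q_1} (u_1 h_2 h_3) + \frac{\partial}{\partial q_2} (u_2 h_3 h_1) + \frac{\partial}{\partial q_3} (u_3 h_1 h_2) \right), \tag{1.265} $$

$$ \text{curl } \mathbf{u} = \nabla \times \mathbf{u} = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1 \mathbf{e}_1 & h_2 \mathbf{e}_2 & h_3 \mathbf{e}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ u_1 h_1 & u_2 h_2 & u_3 h_3 \end{vmatrix}, \tag{1.266} $$

$$ \text{div grad } \phi = \nabla^2 \phi = \frac{1}{h_1 h_2 h_3} \left( \frac{\partial}{\partial q_1} \left( \frac{h_2 h_3}{h_1} \frac{\partial \phi}{\partial q_1} \right) + \frac{\partial}{\partial q_2} \left( \frac{h_3 h_1}{h_2} \frac{\partial \phi}{\partial q_2} \right) + \frac{\partial}{\partial q_3} \left( \frac{h_1 h_2}{h_3} \frac{\partial \phi}{\partial q_3} \right) \right). \tag{1.267} $$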



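Example 1.9

Find expressions for the gradient, divergence, and curl in cylindrical coordinates $(r, \theta, z)$ where

$$ x_1 = r \cos \theta, \tag{1.268} $$

$$ x_2 = r \sin \theta, \tag{1.269} $$

$$ x_3 = z. \tag{1.270} $$

The 1, 2 and 3 directions are associated with $r$, $\theta$, and $z$, respectively. From Eq. (1.263), the scale factors are

$$ h_r = \sqrt{ \left( \frac{\partial x_1}{\partial r} \right)^2 + \left( \frac{\partial x_2}{\partial r} \right)^2 + \left( \frac{\partial x_3}{\partial r} \right)^2 }, \tag{1.271} $$

$$ = \sqrt{ \cos^2 \theta + \sin^2 \theta }, \tag{1.272} $$

$$ = 1, \tag{1.273} $$

$$ h_\theta = \sqrt{ \left( \frac{\partial x_1}{\partial \theta} \right)^2 + \left( \frac{\partial x_2}{\partial \theta} \right)^2 + \left( \frac{\partial x_3}{\partial \theta} \right)^2 }, \tag{1.274} $$

$$ = \sqrt{ r^2 \sin^2 \theta + r^2 \cos^2 \theta }, \tag{1.275} $$

$$ = r, \tag{1.276} $$

$$ h_z = \sqrt{ \left( \frac{\partial x_1}{\partial z} \right)^2 + \left( \frac{\partial x_2}{\partial z} \right)^2 + \left( \frac{\partial x_3}{\partial z} \right)^2 }, \tag{1.277} $$

$$ = 1. \tag{1.278} $$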
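The scale factors of Eq. (1.263) can also be estimated numerically for any coordinate map, which is a useful check before applying the general formulas. This is a minimal sketch using central differences; `scale_factors` and `cylindrical_to_cartesian` are illustrative names, and the step size `h` is an assumption.

```python
import math

def cylindrical_to_cartesian(q):
    """Map (r, theta, z) to (x_1, x_2, x_3), per Eqs. (1.268)-(1.270)."""
    r, theta, z = q
    return (r * math.cos(theta), r * math.sin(theta), z)

def scale_factors(transform, q, h=1e-6):
    """Estimate h_i = sqrt(sum_k (d x_k / d q_i)^2) by central differences."""
    factors = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_plus, x_minus = transform(q_plus), transform(q_minus)
        factors.append(math.sqrt(sum(((a - b) / (2 * h)) ** 2
                                     for a, b in zip(x_plus, x_minus))))
    return factors

if __name__ == "__main__":
    # Sample point r = 2.5, theta = 0.8, z = -1; expect roughly [1, 2.5, 1].
    print(scale_factors(cylindrical_to_cartesian, (2.5, 0.8, -1.0)))
```

The same function can be pointed at another orthogonal map, such as spherical coordinates, by swapping the `transform` argument.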



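With these scale factors, we get

$$ \text{grad } \phi = \frac{\partial \phi}{\partial r} \mathbf{e}_r + \frac{1}{r} \frac{\partial \phi}{\partial \theta} \mathbf{e}_\theta + \frac{\partial \phi}{\partial z} \mathbf{e}_z, \tag{1.279} $$

$$ \text{div } \mathbf{u} = \frac{1}{r} \left( \frac{\partial}{\partial r} (u_r r) + \frac{\partial}{\partial \theta} (u_\theta) + \frac{\partial}{\partial z} (u_z r) \right) = \frac{\partial u_r}{\partial r} + \frac{u_r}{r} + \frac{1}{r} \frac{\partial u_\theta}{\partial \theta} + \frac{\partial u_z}{\partial z}, \tag{1.280} $$

$$ \text{curl } \mathbf{u} = \frac{1}{r} \begin{vmatrix} \mathbf{e}_r & r \mathbf{e}_\theta & \mathbf{e}_z \\ \frac{\partial}{\partial r} & \frac{\partial}{\partial \theta} & \frac{\partial}{\partial z} \\ u_r & u_\theta r & u_z \end{vmatrix}. \tag{1.281} $$

1.4 Maxima and minima

Consider the real function $f(x)$, where $x \in [a, b]$. Extrema are at $x = x_m$, where $f'(x_m) = 0$, if $x_m \in [a, b]$. It is a local minimum, a local maximum, or an inflection point according to whether $f''(x_m)$ is positive, negative or zero, respectively.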

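Now consider a function of two variables $f(x, y)$, with $x \in [a, b]$, $y \in [c, d]$. A necessary condition for an extremum is

$$ \frac{\partial f}{\partial x}(x_m, y_m) = \frac{\partial f}{\partial y}(x_m, y_m) = 0, \tag{1.282} $$

where $x_m \in [a, b]$, $y_m \in [c, d]$. Next, we find the Hessian¹⁴ matrix:

$$ \mathbf{H} = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix}. \tag{1.283} $$

We use $\mathbf{H}$ and its elements to determine the character of the local extremum:

• $f$ is a maximum if $\partial^2 f/\partial x^2 < 0$, $\partial^2 f/\partial y^2 < 0$, and $|\partial^2 f/\partial x \partial y| < \sqrt{(\partial^2 f/\partial x^2)(\partial^2 f/\partial y^2)}$,

• $f$ is a minimum if $\partial^2 f/\partial x^2 > 0$, $\partial^2 f/\partial y^2 > 0$, and $|\partial^2 f/\partial x \partial y| < \sqrt{(\partial^2 f/\partial x^2)(\partial^2 f/\partial y^2)}$,

• $f$ is a saddle otherwise, as long as $\det \mathbf{H} \neq 0$, and

• if $\det \mathbf{H} = 0$, higher order terms need to be considered.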

Note that the first two conditions for maximum and minimum require that terms on the diagonal of $\mathbf{H}$ must dominate those on the off-diagonal with diagonal terms further required to be of the same sign. For higher dimensional systems, one can show that if all the eigenvalues of $\mathbf{H}$ are negative, $f$ is maximized, and if all the eigenvalues of $\mathbf{H}$ are positive, $f$ is minimized.

One can begin to understand this by considering a Taylor¹⁵ series expansion of $f(x, y)$. Taking $\mathbf{x} = (x, y)^T$ and $d\mathbf{x} = (dx, dy)^T$, multi-variable Taylor series expansion gives

$$ f(\mathbf{x} + d\mathbf{x}) = f(\mathbf{x}) + d\mathbf{x}^T \cdot \underbrace{\nabla f}_{=0} + \frac{1}{2} d\mathbf{x}^T \cdot \mathbf{H} \cdot d\mathbf{x} + \ldots. \tag{1.284} $$

At an extremum, $\nabla f = 0$, so

$$ f(\mathbf{x} + d\mathbf{x}) = f(\mathbf{x}) + \frac{1}{2} d\mathbf{x}^T \cdot \mathbf{H} \cdot d\mathbf{x} + \ldots. \tag{1.285} $$

Later (see p. 276 and Sec. 8.2.3.8), we shall see that, by virtue of the definition of the term "positive definite," if the Hessian $\mathbf{H}$ is positive definite, then for all non-zero $d\mathbf{x}$, $d\mathbf{x}^T \cdot \mathbf{H} \cdot d\mathbf{x} > 0$, which corresponds to a minimum. For negative definite $\mathbf{H}$, we have a maximum.



14 Ludwig Otto Hesse, 1811-1874, German mathematician, studied under Jacobi.
15 Brook Taylor, 1685-1731, English mathematician, musician, and painter.

Example 1.10

Consider extrema of

$$ f = x^2 - y^2. \tag{1.286} $$

Equating partial derivatives with respect to $x$ and to $y$ to zero, we get

$$ \frac{\partial f}{\partial x} = 2x = 0, \tag{1.287} $$

$$ \frac{\partial f}{\partial y} = -2y = 0. \tag{1.288} $$

This gives $x = 0$, $y = 0$. For these values we find that

$$ \mathbf{H} = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix}, \tag{1.289} $$

$$ = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}. \tag{1.290} $$

Since $\det \mathbf{H} = -4 \neq 0$, and $\partial^2 f/\partial x^2$ and $\partial^2 f/\partial y^2$ have different signs, the equilibrium is a saddle point.
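The classification rules for a critical point of $f(x, y)$ reduce to a test on the sign of $\det \mathbf{H}$ and of one diagonal entry. The following is a minimal sketch of that test; the function name and the returned labels are illustrative choices, and the exact comparison with zero is only appropriate for exactly representable inputs.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point of f(x, y) from its second partial derivatives.

    det H > 0 is equivalent to |fxy| < sqrt(fxx * fyy), which also forces
    fxx and fyy to be non-zero and of the same sign.
    """
    det = fxx * fyy - fxy * fxy
    if det == 0:
        return "indeterminate"   # higher order terms must be considered
    if det < 0:
        return "saddle"
    return "minimum" if fxx > 0 else "maximum"

if __name__ == "__main__":
    # Example 1.10: f = x^2 - y^2 at (0, 0) has H = [[2, 0], [0, -2]].
    print(classify_critical_point(2.0, 0.0, -2.0))
```

For floating-point Hessians a small tolerance around zero would be a sensible replacement for the `det == 0` comparison.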



1.4.1 Derivatives of integral expressions 

Often functions are expressed in terms of integrals. For example

$$ y(x) = \int_{a(x)}^{b(x)} f(x, t) \, dt. \tag{1.291} $$

Here $t$ is a dummy variable of integration. Leibniz's¹⁶ rule tells us how to take derivatives of functions in integral form:

$$ y(x) = \int_{a(x)}^{b(x)} f(x, t) \, dt, \tag{1.292} $$

$$ \frac{dy(x)}{dx} = f(x, b(x)) \frac{db(x)}{dx} - f(x, a(x)) \frac{da(x)}{dx} + \int_{a(x)}^{b(x)} \frac{\partial f(x, t)}{\partial x} \, dt. \tag{1.293} $$

Inverting this arrangement in a special case, we note if

$$ y(x) = y(x_0) + \int_{x_0}^{x} f(t) \, dt, \tag{1.294} $$

then

$$ \frac{dy(x)}{dx} = f(x) \underbrace{\frac{dx}{dx}}_{=1} - f(x_0) \underbrace{\frac{dx_0}{dx}}_{=0} + \int_{x_0}^{x} \underbrace{\frac{\partial f(t)}{\partial x}}_{=0} \, dt, \tag{1.295} $$

$$ \frac{dy(x)}{dx} = f(x). \tag{1.296} $$

Note that the integral expression naturally includes the initial condition that when $x = x_0$, $y = y(x_0)$. This needs to be expressed separately for the differential version of the equation.

16 Gottfried Wilhelm von Leibniz, 1646-1716, German mathematician and philosopher of great influence; co-inventor with Sir Isaac Newton, 1643-1727, of the calculus.



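Example 1.11

Find $dy/dx$ if

$$ y(x) = \int_{x}^{x^2} (x + 1) t^2 \, dt. \tag{1.297} $$

Using Leibniz's rule we get

$$ \frac{dy(x)}{dx} = \left( (x + 1) x^4 \right)(2x) - \left( (x + 1) x^2 \right)(1) + \int_{x}^{x^2} t^2 \, dt, \tag{1.298} $$

$$ = 2x^6 + 2x^5 - x^3 - x^2 + \left[ \frac{t^3}{3} \right]_{x}^{x^2}, $$

$$ = 2x^6 + 2x^5 - x^3 - x^2 + \frac{x^6}{3} - \frac{x^3}{3}, $$

$$ = \frac{7x^6}{3} + 2x^5 - \frac{4x^3}{3} - x^2. $$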
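The three terms of Leibniz's rule in this example can be evaluated numerically and compared with the polynomial result. The sketch below assumes a composite Simpson rule for the interior integral (exact for the quadratic integrand here); all function names are illustrative.

```python
def integrand(x, t):
    """f(x, t) = (x + 1) t^2 from Eq. (1.297)."""
    return (x + 1.0) * t * t

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def dy_dx_leibniz(x):
    """Boundary terms plus the integral of df/dx, per Eq. (1.293)."""
    a, b = x, x * x            # limits a(x) = x, b(x) = x^2
    da, db = 1.0, 2.0 * x      # their derivatives
    # d/dx of (x + 1) t^2 at fixed t is t^2
    interior = simpson(lambda t: t * t, a, b)
    return integrand(x, b) * db - integrand(x, a) * da + interior

def dy_dx_polynomial(x):
    """Closed-form result 7x^6/3 + 2x^5 - 4x^3/3 - x^2."""
    return 7 * x**6 / 3 + 2 * x**5 - 4 * x**3 / 3 - x**2

if __name__ == "__main__":
    for x in (0.5, 1.5, 2.0):
        print(x, dy_dx_leibniz(x), dy_dx_polynomial(x))
```

Each printed row should show the same value twice, up to rounding.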



In this case it is possible to integrate explicitly to achieve the same result:

$$ y(x) = (x + 1) \int_{x}^{x^2} t^2 \, dt, $$

$$ = (x + 1) \left[ \frac{t^3}{3} \right]_{x}^{x^2}, $$

$$ = (x + 1) \left( \frac{x^6}{3} - \frac{x^3}{3} \right), $$

$$ y(x) = \frac{x^7}{3} + \frac{x^6}{3} - \frac{x^4}{3} - \frac{x^3}{3}, $$

$$ \frac{dy(x)}{dx} = \frac{7x^6}{3} + 2x^5 - \frac{4x^3}{3} - x^2. \tag{1.307} $$

So the two methods give identical results.

1.4.2 Calculus of variations

The problem is to find the function $y(x)$, with $x \in [x_1, x_2]$, and boundary conditions $y(x_1) = y_1$, $y(x_2) = y_2$, such that

$$ I = \int_{x_1}^{x_2} f(x, y, y') \, dx, \tag{1.308} $$

is an extremum. Here, we find an operation of mapping a function $y(x)$ into a scalar $I$, which can be expressed as $I = \mathcal{F}(y)$. The operator $\mathcal{F}$ which performs this task is known as a functional.

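If $y(x)$ is the desired solution, let $Y(x) = y(x) + \epsilon h(x)$, where $h(x_1) = h(x_2) = 0$. Thus, $Y(x)$ also satisfies the boundary conditions; also $Y'(x) = y'(x) + \epsilon h'(x)$. We can write

$$ I(\epsilon) = \int_{x_1}^{x_2} f(x, Y, Y') \, dx. \tag{1.309} $$

Taking $dI/d\epsilon$, utilizing Leibniz's rule, Eq. (1.293), we get

$$ \frac{dI}{d\epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial x} \underbrace{\frac{\partial x}{\partial \epsilon}}_{=0} + \frac{\partial f}{\partial Y} \underbrace{\frac{\partial Y}{\partial \epsilon}}_{h(x)} + \frac{\partial f}{\partial Y'} \underbrace{\frac{\partial Y'}{\partial \epsilon}}_{h'(x)} \right) dx. \tag{1.310} $$

Evaluating, we find

$$ \frac{dI}{d\epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial x} \, 0 + \frac{\partial f}{\partial Y} h(x) + \frac{\partial f}{\partial Y'} h'(x) \right) dx. \tag{1.311} $$

Since $I$ is an extremum at $\epsilon = 0$, we have $dI/d\epsilon = 0$ for $\epsilon = 0$. This gives

$$ 0 = \int_{x_1}^{x_2} \left. \left( \frac{\partial f}{\partial Y} h(x) + \frac{\partial f}{\partial Y'} h'(x) \right) \right|_{\epsilon = 0} dx. \tag{1.312} $$

Also when $\epsilon = 0$, we have $Y = y$, $Y' = y'$, so

$$ 0 = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} h(x) + \frac{\partial f}{\partial y'} h'(x) \right) dx. \tag{1.313} $$

Look at the second term in this integral. From integration by parts we get

$$ \int_{x_1}^{x_2} \frac{\partial f}{\partial y'} h'(x) \, dx = \int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \frac{dh}{dx} \, dx = \int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \, dh, \tag{1.314} $$

$$ = \left. \frac{\partial f}{\partial y'} h(x) \right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) h(x) \, dx, \tag{1.315} $$

$$ = - \int_{x_1}^{x_2} \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) h(x) \, dx. \tag{1.316} $$

The first term in Eq. (1.315) is zero because of our conditions on $h(x_1)$ and $h(x_2)$. Thus, substituting Eq. (1.316) into the original equation, Eq. (1.313), we find

$$ \int_{x_1}^{x_2} \underbrace{\left( \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) \right)}_{=0} h(x) \, dx = 0. \tag{1.317} $$

The equality holds for all $h(x)$, so that we must have

$$ \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) = 0. \tag{1.318} $$

This is called the Euler¹⁷-Lagrange¹⁸ equation; sometimes it is simply called Euler's equation. While this is, in general, the preferred form of the Euler-Lagrange equation, its explicit dependency on the two end conditions is better displayed by considering a slightly different form. By expanding the total derivative term, that is

$$ \frac{d}{dx} \left( \frac{\partial f}{\partial y'}(x, y, y') \right) = \frac{\partial^2 f}{\partial y' \partial x} \underbrace{\frac{dx}{dx}}_{=1} + \frac{\partial^2 f}{\partial y' \partial y} \underbrace{\frac{dy}{dx}}_{y'} + \frac{\partial^2 f}{\partial y' \partial y'} \underbrace{\frac{dy'}{dx}}_{y''}, \tag{1.319} $$

$$ = \frac{\partial^2 f}{\partial y' \partial x} + \frac{\partial^2 f}{\partial y' \partial y} y' + \frac{\partial^2 f}{\partial y' \partial y'} y'', \tag{1.320} $$

the Euler-Lagrange equation, Eq. (1.318), after slight rearrangement becomes

$$ \frac{\partial^2 f}{\partial y' \partial y'} y'' + \frac{\partial^2 f}{\partial y' \partial y} y' + \frac{\partial^2 f}{\partial y' \partial x} - \frac{\partial f}{\partial y} = 0, \tag{1.321} $$

$$ f_{y'y'} \frac{d^2 y}{dx^2} + f_{y'y} \frac{dy}{dx} + f_{y'x} - f_y = 0. \tag{1.322} $$

This is clearly a second order differential equation for $f_{y'y'} \neq 0$, and in general, non-linear. If $f_{y'y'}$ is always non-zero, the problem is said to be regular. If $f_{y'y'} = 0$ at any point, the equation is no longer second order, and the problem is said to be singular at such points. Note that satisfaction of two boundary conditions becomes problematic for equations less than second order.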

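There are several special cases of the function $f$.

• $f = f(x, y)$:

The Euler-Lagrange equation is

$$ \frac{\partial f}{\partial y} = 0, \tag{1.323} $$

which is easily solved:

$$ f(x, y) = A(x), \tag{1.324} $$

which, knowing $f$, is then solved for $y(x)$.

• $f = f(x, y')$:

The Euler-Lagrange equation is

$$ \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) = 0, \tag{1.325} $$

which yields

$$ \frac{\partial f}{\partial y'} = A, \tag{1.326} $$

$$ f(x, y') = A y' + B(x). \tag{1.327} $$

Again, knowing $f$, the equation is solved for $y'$ and then integrated to find $y(x)$.

• $f = f(y, y')$:

The Euler-Lagrange equation is

$$ \frac{\partial f}{\partial y} - \frac{d}{dx} \left( \frac{\partial f}{\partial y'}(y, y') \right) = 0, \tag{1.328} $$

$$ \frac{\partial f}{\partial y} - \left( \frac{\partial^2 f}{\partial y \partial y'} \frac{dy}{dx} + \frac{\partial^2 f}{\partial y' \partial y'} \frac{dy'}{dx} \right) = 0, \tag{1.329} $$

$$ \frac{\partial f}{\partial y} - \frac{\partial^2 f}{\partial y \partial y'} \frac{dy}{dx} - \frac{\partial^2 f}{\partial y' \partial y'} \frac{d^2 y}{dx^2} = 0. \tag{1.330} $$

Multiply by $y'$ to get

$$ y' \left( \frac{\partial f}{\partial y} - \frac{\partial^2 f}{\partial y \partial y'} \frac{dy}{dx} - \frac{\partial^2 f}{\partial y' \partial y'} \frac{d^2 y}{dx^2} \right) = 0. \tag{1.331} $$

Add and subtract $(\partial f/\partial y') y''$ to get

$$ y' \left( \frac{\partial f}{\partial y} - \frac{\partial^2 f}{\partial y \partial y'} \frac{dy}{dx} - \frac{\partial^2 f}{\partial y' \partial y'} \frac{d^2 y}{dx^2} \right) + \frac{\partial f}{\partial y'} y'' - \frac{\partial f}{\partial y'} y'' = 0. \tag{1.332} $$

Regroup to get

$$ \underbrace{\frac{\partial f}{\partial y} y' + \frac{\partial f}{\partial y'} y''}_{= df/dx} - \underbrace{\left( y' \left( \frac{\partial^2 f}{\partial y \partial y'} \frac{dy}{dx} + \frac{\partial^2 f}{\partial y' \partial y'} \frac{d^2 y}{dx^2} \right) + \frac{\partial f}{\partial y'} y'' \right)}_{= d/dx \left( y' \, \partial f/\partial y' \right)} = 0. \tag{1.333} $$

Regroup again to get

$$ \frac{d}{dx} \left( f - y' \frac{\partial f}{\partial y'} \right) = 0, \tag{1.334} $$

which can be integrated. Thus,

$$ f(y, y') - y' \frac{\partial f}{\partial y'} = K, \tag{1.335} $$

where $K$ is an arbitrary constant. What remains is a first order ordinary differential equation which can be solved. Another integration constant arises. This second constant, along with $K$, are determined by the two end point conditions.

17 Leonhard Euler, 1707-1783, prolific Swiss mathematician, born in Basel, died in St. Petersburg.
18 Joseph-Louis Lagrange, 1736-1813, Italian-born French mathematician.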


I 

Example 1.12 

Find the curve of minimum length between the points $(x_1, y_1)$ and $(x_2, y_2)$.

If $y(x)$ is the curve, then $y(x_1) = y_1$ and $y(x_2) = y_2$. The length of the curve is

$$ L = \int_{x_1}^{x_2} \sqrt{1 + (y')^2} \, dx. \tag{1.336} $$

So our $f$ reduces to $f(y') = \sqrt{1 + (y')^2}$. The Euler-Lagrange equation is

$$ \frac{d}{dx} \left( \frac{y'}{\sqrt{1 + (y')^2}} \right) = 0, \tag{1.337} $$

which can be integrated to give

$$ \frac{y'}{\sqrt{1 + (y')^2}} = K. \tag{1.338} $$

Solving for $y'$ we get

$$ y' = \sqrt{\frac{K^2}{1 - K^2}} \equiv A, \tag{1.339} $$

from which

$$ y = Ax + B. \tag{1.340} $$

The constants $A$ and $B$ are obtained from the boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$. The shortest distance between two points is a straight line.



I 

Example 1.13 

Find the curve through the points $(x_1, y_1)$ and $(x_2, y_2)$, such that the surface area of the body of revolution by rotating the curve around the $x$-axis is a minimum.

We wish to minimize

$$ I = \int_{x_1}^{x_2} y \sqrt{1 + (y')^2} \, dx. \tag{1.341} $$

Figure 1.6: Body of revolution of minimum surface area for $(x_1, y_1) = (-1, 3.08616)$ and $(x_2, y_2) = (2, 2.25525)$. Panels: the curve with endpoints at $(-1, 3.09)$, $(2, 2.26)$ which minimizes surface area of the body of revolution, and the corresponding surface of revolution.

Here $f$ reduces to $f(y, y') = y \sqrt{1 + (y')^2}$. So the Euler-Lagrange equation reduces to

$$ f(y, y') - y' \frac{\partial f}{\partial y'} = A, \tag{1.342} $$

$$ y \sqrt{1 + y'^2} - y' y \frac{y'}{\sqrt{1 + y'^2}} = A, \tag{1.343} $$

$$ y (1 + y'^2) - y y'^2 = A \sqrt{1 + y'^2}, \tag{1.344} $$

$$ y = A \sqrt{1 + y'^2}, \tag{1.345} $$

$$ y' = \sqrt{\left( \frac{y}{A} \right)^2 - 1}, \tag{1.346} $$

$$ y(x) = A \cosh \frac{x - B}{A}. \tag{1.347} $$

This is a catenary. The constants $A$ and $B$ are determined from the boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$. In general this requires a trial and error solution of simultaneous algebraic equations. If $(x_1, y_1) = (-1, 3.08616)$ and $(x_2, y_2) = (2, 2.25525)$, one finds solution of the resulting algebraic equations gives $A = 2$, $B = 1$.

For these conditions, the curve $y(x)$ along with the resulting body of revolution of minimum surface area are plotted in Fig. 1.6.
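The catenary solution of Example 1.13 can be checked against both the boundary conditions and the first integral, Eq. (1.342). This is a minimal sketch assuming the quoted values $A = 2$, $B = 1$; the function names are illustrative.

```python
import math

def catenary(x, A, B):
    """y(x) = A cosh((x - B) / A), Eq. (1.347)."""
    return A * math.cosh((x - B) / A)

def catenary_slope(x, A, B):
    """y'(x) = sinh((x - B) / A)."""
    return math.sinh((x - B) / A)

def first_integral(x, A, B):
    """f - y' df/dy' for f = y sqrt(1 + y'^2); Eq. (1.342) says this equals A."""
    y, yp = catenary(x, A, B), catenary_slope(x, A, B)
    root = math.sqrt(1.0 + yp * yp)
    return y * root - yp * (y * yp / root)

if __name__ == "__main__":
    A, B = 2.0, 1.0
    print(catenary(-1.0, A, B), catenary(2.0, A, B))   # end point values
    print([first_integral(x, A, B) for x in (-1.0, 0.3, 2.0)])
```

The first line should reproduce the end point ordinates quoted in the example, and every entry in the second line should equal $A$.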



1.5 Lagrange multipliers 

Suppose we have to determine the extremum of $f(x_1, x_2, \ldots, x_M)$ subject to the $N$ constraints

$$ g_n(x_1, x_2, \ldots, x_M) = 0, \qquad n = 1, 2, \ldots, N. \tag{1.348} $$

Define

$$ f^* = f - \lambda_1 g_1 - \lambda_2 g_2 - \ldots - \lambda_N g_N, \tag{1.349} $$

where the $\lambda_n$ $(n = 1, 2, \ldots, N)$ are unknown constants called Lagrange multipliers. To get the extremum of $f^*$, we equate to zero its derivative with respect to $x_1, x_2, \ldots, x_M$. Thus, we have

$$ \frac{\partial f^*}{\partial x_m} = 0, \qquad m = 1, \ldots, M, \tag{1.350} $$

$$ g_n = 0, \qquad n = 1, \ldots, N, \tag{1.351} $$

which are $(M + N)$ equations that can be solved for $x_m$ $(m = 1, 2, \ldots, M)$ and $\lambda_n$ $(n = 1, 2, \ldots, N)$.



I 

Example 1.14 

Extremize $f = x^2 + y^2$ subject to the constraint $g = 5x^2 - 6xy + 5y^2 - 8 = 0$.

Let

$$f^* = x^2 + y^2 - \lambda (5x^2 - 6xy + 5y^2 - 8), \tag{1.352}$$

from which

$$\frac{\partial f^*}{\partial x} = 2x - 10\lambda x + 6\lambda y = 0, \tag{1.353}$$

$$\frac{\partial f^*}{\partial y} = 2y + 6\lambda x - 10\lambda y = 0, \tag{1.354}$$

$$g = 5x^2 - 6xy + 5y^2 - 8 = 0. \tag{1.355}$$

From Eq. (1.353),

$$\lambda = \frac{2x}{10x - 6y}, \tag{1.356}$$

which, when substituted into Eq. (1.354), gives

$$x = \pm y. \tag{1.357}$$

Equation (1.357), when solved in conjunction with Eq. (1.355), gives the extrema to be at $(x, y) = (\sqrt{2}, \sqrt{2})$, $(-\sqrt{2}, -\sqrt{2})$, $(1/\sqrt{2}, -1/\sqrt{2})$, $(-1/\sqrt{2}, 1/\sqrt{2})$. The first two sets give $f = 4$ (maximum) and the last two $f = 1$ (minimum). The function to be maximized along with the constraint function and its image are plotted in Fig. 1.7.
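The four candidate points can be checked by direct substitution into Eqs. (1.353)-(1.355). The sketch below is only a numerical check under the stated formulas; the helper name `check_candidate` is hypothetical.

```python
import math

def check_candidate(x, y):
    """Return (lambda, df*/dx, df*/dy, g, f) at a candidate point."""
    lam = 2 * x / (10 * x - 6 * y)                 # Eq. (1.356)
    dfdx = 2 * x - 10 * lam * x + 6 * lam * y      # Eq. (1.353)
    dfdy = 2 * y + 6 * lam * x - 10 * lam * y      # Eq. (1.354)
    g = 5 * x**2 - 6 * x * y + 5 * y**2 - 8        # Eq. (1.355)
    return lam, dfdx, dfdy, g, x**2 + y**2

s = math.sqrt(2)
candidates = [(s, s), (-s, -s), (1 / s, -1 / s), (-1 / s, 1 / s)]
results = [check_candidate(x, y) for x, y in candidates]
for point, res in zip(candidates, results):
    print(point, res)
```

The stationarity residuals and the constraint residual should vanish to rounding error at all four points, with $f = 4$ at the first two and $f = 1$ at the last two.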

I 



A similar technique can be used for the extremization of a functional with constraint. We wish to find the function $y(x)$, with $x \in [x_1, x_2]$, and $y(x_1) = y_1$, $y(x_2) = y_2$, such that the integral

$$I = \int_{x_1}^{x_2} f(x, y, y')\, dx, \tag{1.358}$$

is an extremum, and satisfies the constraint

$$g = 0. \tag{1.359}$$

Define

$$I^* = I - \lambda g, \tag{1.360}$$

and continue as before.

Figure 1.7: Unconstrained function $f(x,y)$ along with constrained function and constraint function (image of constrained function).



I 

Example 1.15 

Extremize $I$, where

$$I = \int_0^a y \sqrt{1 + (y')^2}\, dx, \tag{1.361}$$

with $y(0) = y(a) = 0$, and subject to the constraint

$$\int_0^a \sqrt{1 + (y')^2}\, dx = \ell. \tag{1.362}$$

That is, find the maximum surface area of a body of revolution which has a constant length. Let

$$g = \int_0^a \sqrt{1 + (y')^2}\, dx - \ell = 0. \tag{1.363}$$

Then let

$$I^* = I - \lambda g = \int_0^a y \sqrt{1 + (y')^2}\, dx - \lambda \int_0^a \sqrt{1 + (y')^2}\, dx + \lambda \ell, \tag{1.364}$$

$$= \int_0^a (y - \lambda) \sqrt{1 + (y')^2}\, dx + \lambda \ell, \tag{1.365}$$

$$= \int_0^a \left( (y - \lambda) \sqrt{1 + (y')^2} + \frac{\lambda \ell}{a} \right) dx. \tag{1.366}$$

With $f^* = (y - \lambda)\sqrt{1 + (y')^2} + \lambda \ell / a$, we have the Euler-Lagrange equation

$$\frac{\partial f^*}{\partial y} - \frac{d}{dx} \left( \frac{\partial f^*}{\partial y'} \right) = 0. \tag{1.367}$$

Integrating from an earlier developed relationship, Eq. (1.335), when $f = f(y, y')$, and absorbing $\lambda \ell / a$ into a constant $A$, we have

$$(y - \lambda) \sqrt{1 + (y')^2} - y' (y - \lambda) \frac{y'}{\sqrt{1 + (y')^2}} = A, \tag{1.368}$$

from which

$$(y - \lambda) \left( 1 + (y')^2 \right) - (y')^2 (y - \lambda) = A \sqrt{1 + (y')^2}, \tag{1.369}$$

$$(y - \lambda) \left( 1 + (y')^2 - (y')^2 \right) = A \sqrt{1 + (y')^2}, \tag{1.370}$$

$$y - \lambda = A \sqrt{1 + (y')^2}, \tag{1.371}$$

$$y' = \sqrt{\left( \frac{y - \lambda}{A} \right)^2 - 1}, \tag{1.372}$$

$$y = \lambda + A \cosh \frac{x - B}{A}. \tag{1.373}$$

Here $A$, $B$, $\lambda$ have to be numerically determined from the three conditions $y(0) = y(a) = 0$, $g = 0$. If we take the case where $a = 1$, $\ell = 5/4$, we find that $A = 0.422752$, $B = 1/2$, $\lambda = -0.754549$. For these values, the curve of interest, along with the surface of revolution, is plotted in Fig. 1.8.

Figure 1.8: Curve of length $\ell = 5/4$ with $y(0) = y(1) = 0$ whose surface area of corresponding body of revolution (also shown) is maximum.
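The numerical determination of $A$, $B$, $\lambda$ can be sketched as follows. This is one possible approach, not necessarily the one used for the quoted values: it assumes $B = a/2$ from the symmetry of $y(0) = y(a)$, so that the length constraint becomes $2A \sinh(a/(2A)) = \ell$, and solves that by bisection. The function name `solve_example` and the bracketing interval are hypothetical.

```python
import math

def solve_example(a=1.0, length=1.25, lo=0.1, hi=10.0, iters=200):
    """Find A, B, lambda for y = lambda + A*cosh((x - B)/A).

    Assumes B = a/2 by symmetry and that [lo, hi] brackets the root in A.
    """
    def h(A):
        # arc length of the catenary on [0, a] minus the prescribed length
        return 2 * A * math.sinh(a / (2 * A)) - length

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    A = 0.5 * (lo + hi)
    lam = -A * math.cosh(a / (2 * A))   # from y(0) = 0
    return A, a / 2, lam

A, B, lam = solve_example()
print(A, B, lam)
```

For $a = 1$, $\ell = 5/4$ the result should agree with the values quoted in the example to the digits shown there.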

Problems

1. If

$$z^3 + zx + x^4 y = 2y^3,$$

(a) find a general expression for $\partial z / \partial x$, $\partial z / \partial y$,

(b) evaluate $\partial z / \partial x$, $\partial z / \partial y$ at $(x, y) = (1, 2)$, considering only real values of $x, y, z$, i.e. $x, y, z \in \mathbb{R}^1$.

(c) Give a computer generated plot of the surface $z(x,y)$ for $x \in [-2,2]$, $y \in [-2,2]$, $z \in [-2,2]$.

2. Determine the general curve $y(x)$, with $x \in [x_1, x_2]$, of total length $L$ with endpoints $y(x_1) = y_1$ and $y(x_2) = y_2$ fixed, for which the area under the curve, $\int_{x_1}^{x_2} y\, dx$, is a maximum. Show that if $(x_1, y_1) = (0,0)$; $(x_2, y_2) = (1,1)$; $L = 3/2$, then the curve which maximizes the area and satisfies all constraints is the circle, $(y + 0.254272)^2 + (x - 1.2453)^2 = (1.26920)^2$. Plot this curve. What is the area? Verify that each constraint is satisfied. What function $y(x)$ minimizes the area and satisfies all constraints? Plot this curve. What is the area? Verify that each constraint is satisfied.

3. Show that if a ray of light is reflected from a mirror, the shortest distance of travel is when the angle 
of incidence on the mirror is equal to the angle of reflection. 

4. The speed of light in different media separated by a planar interface is $c_1$ and $c_2$. Show that if the time taken for light to go from a fixed point in one medium to another in the second is a minimum, the angle of incidence, $\alpha_i$, and the angle of refraction, $\alpha_r$, are related by

$$\frac{\sin \alpha_i}{\sin \alpha_r} = \frac{c_1}{c_2}.$$

5. T is a quadrilateral with perimeter P. Find the form of T such that its area is a maximum. What is 
this area? 

6. A body slides due to gravity from point $A$ to point $B$ along the curve $y = f(x)$. There is no friction and the initial velocity is zero. If points $A$ and $B$ are fixed, find $f(x)$ for which the time taken will be the least. What is this time? If $A : (x,y) = (1,2)$, $B : (x,y) = (0,0)$, where distances are in meters, plot the minimum time curve, and find the minimum time if the gravitational acceleration is $\mathbf{g} = -9.81\ \mathrm{m/s^2}\ \mathbf{j}$.

7. Consider the integral $I = \int_0^1 (y' - y + e^x)^2\, dx$. What kind of extremum does this integral have (maximum or minimum)? What should $y(x)$ be for this extremum? What does the solution of the Euler-Lagrange equation give, if $y(0) = 0$ and $y(1) = -e$? Find the value of the extremum. Plot $y(x)$ for the extremum. If $y_0(x)$ is the solution of the Euler-Lagrange equation, compute $I$ for $y_1(x) = y_0(x) + h(x)$, where you can take any $h(x)$ you like, but with $h(0) = h(1) = 0$.

8. Find the length of the shortest curve between two points with cylindrical coordinates $(r, \theta, z) = (a, 0, 0)$ and $(r, \theta, z) = (a, 0, Z)$ along the surface of the cylinder $r = a$.

9. Determine the shape of a parallelogram with a given area which has the least perimeter.

CC BY-NC-ND. 29 July 2012, Sen & Powers.

10. Find the extremum of the functional

$$\int_0^1 \left( x^2 y'^2 + 40 x^4 y \right) dx,$$

with $y(0) = 0$ and $y(1) = 1$. Plot $y(x)$ which renders the integral at an extreme point.

1 1 . Find the point on the plane ax + by + cz = d which is nearest to the origin. 

12. Extremize the integral 

l 

y 2 dx, 

subject to the end conditions y(0) = 0, y(l) = 0, and also the constraint 

• l 

y dx = 1. 
o 

Plot the function y{x) which extremizes the integral and satisfies all constraints. 

13. Show that the functions

$$u = \frac{x + y}{x - y},$$

$$v = \frac{xy}{(x - y)^2},$$

are functionally dependent.

14. Find the point on the curve of intersection of z — xy = 10 and x + y + z = 1, that is closest to the 
origin. 

15. Find a function y{x) with y(0) = 1, j/(l) = that extremizes the integral 



(i; s 



dx. 



Plot y{x) for this function. 
16. For elliptic cylindrical coordinates

$$\xi^1 = \cosh x^1 \cos x^2,$$

$$\xi^2 = \sinh x^1 \sin x^2,$$

$$\xi^3 = x^3,$$

find the Jacobian matrix $\mathbf{J}$ and the metric tensor $\mathbf{G}$. Find the transformation $x^i = x^i(\xi^j)$. Plot lines of constant $x^1$ and $x^2$ in the $\xi^1$ and $\xi^2$ plane.

17. For the elliptic coordinate system of the previous problem, find $\nabla^T \cdot \mathbf{u}$ where $\mathbf{u}$ is an arbitrary vector.

18. For parabolic coordinates

$$\xi^1 = x^1 x^2 \cos x^3,$$

$$\xi^2 = x^1 x^2 \sin x^3,$$

$$\xi^3 = \frac{1}{2} \left( (x^2)^2 - (x^1)^2 \right),$$

find the Jacobian matrix $\mathbf{J}$ and the metric tensor $\mathbf{G}$. Find the transformation $x^i = x^i(\xi^j)$. Plot lines of constant $x^1$ and $x^2$ in the $\xi^1$ and $\xi^2$ plane.

19. For the parabolic coordinate system of the previous problem, find $\nabla^T \cdot \mathbf{u}$ where $\mathbf{u}$ is an arbitrary vector.

20. Find the covariant derivative of the contravariant velocity vector in cylindrical coordinates.

21. Prove Eq. (1.293) using the chain rule.

Chapter 2

First-order ordinary differential 
equations 



see Kaplan, 9.1-9.3, 

see Lopez, Chapters 1-3, 

see Riley, Hobson, and Bence, Chapter 12, 



see Bender and Orszag, 1.6.



We consider here the solution of so-called first-order ordinary differential equations. Such 
equations are of the form 

F(x,y,y') = 0, (2.1) 

where $y' = dy/dx$. Note this is fully non-linear. A first order equation typically requires the solution to be specified at one point, though for non-linear equations, this does not guarantee uniqueness. An example, which we will not try to solve analytically, is

$$\left( x y^2 \left( \frac{dy}{dx} \right)^3 + 2 \frac{dy}{dx} + \ln(\sin xy) \right)^2 - 1 = 0, \qquad y(1) = 1. \tag{2.2}$$

Fortunately, many first order equations, even non-linear ones, can be solved by techniques presented in this chapter.



2.1 Separation of variables 

Equation (2.1) is separable if it can be written in the form

$$P(x)\, dx = Q(y)\, dy, \tag{2.3}$$

which can then be integrated.

Figure 2.1: $y(x)$ which solves $y y' = (8x + 1)/y$ with $y(1) = -5$.



I 

Example 2.1 

Solve

$$y y' = \frac{8x + 1}{y}, \qquad \mbox{with } y(1) = -5. \tag{2.4}$$

Separating variables

$$y^2\, dy = 8x\, dx + dx. \tag{2.5}$$

Integrating, we have

$$\frac{y^3}{3} = 4x^2 + x + C. \tag{2.6}$$

The initial condition gives $C = -140/3$, so that the solution is

$$y^3 = 12x^2 + 3x - 140. \tag{2.7}$$

The solution is plotted in Fig. 2.1.
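A quick numerical check of Eq. (2.7) against the differential equation and the initial condition can be written as below. It is only a sketch: the derivative is approximated by a central difference, and the function names are hypothetical.

```python
import math

def y(x):
    """Real cube root of 12x^2 + 3x - 140, Eq. (2.7)."""
    v = 12 * x**2 + 3 * x - 140
    return math.copysign(abs(v) ** (1 / 3), v)

def residual(x, h=1e-6):
    """y*y' - (8x + 1)/y, with y' from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return y(x) * dydx - (8 * x + 1) / y(x)

print(y(1.0), residual(1.0), residual(0.0))
```

The residual should be small wherever $y \ne 0$; near the zero of $12x^2 + 3x - 140$ the right side of Eq. (2.4) is singular and the check is not meaningful.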




2.2 Homogeneous equations 

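A first order differential equation is defined by many¹ as homogeneous if it can be written in the form

$$y' = f\left( \frac{y}{x} \right). \tag{2.8}$$

Defining

$$u = \frac{y}{x}, \tag{2.9}$$

we get

$$y = ux, \tag{2.10}$$

from which

$$y' = u + x u'. \tag{2.11}$$

Substituting in Eq. (2.8) and separating variables, we have

$$u + x u' = f(u), \tag{2.12}$$

$$u + x \frac{du}{dx} = f(u), \tag{2.13}$$

$$x \frac{du}{dx} = f(u) - u, \tag{2.14}$$

$$\frac{du}{f(u) - u} = \frac{dx}{x}, \tag{2.15}$$

which can be integrated.

Equations of the form

$$y' = f\left( \frac{a_1 x + a_2 y + a_3}{a_4 x + a_5 y + a_6} \right), \tag{2.16}$$

can be similarly integrated.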



I 

Example 2.2

Solve

$$x y' = 3y + \frac{y^2}{x}, \qquad \mbox{with } y(1) = 4. \tag{2.17}$$

This can be written as

$$y' = 3 \left( \frac{y}{x} \right) + \left( \frac{y}{x} \right)^2. \tag{2.18}$$

Let $u = y/x$. Then

$$f(u) = 3u + u^2. \tag{2.19}$$

¹The word "homogeneous" has two distinct interpretations in differential equations. In the present section, the word actually refers to the function $f$, which is better considered as a so-called homogeneous function of degree zero, which implies $f(tx, ty) = f(x, y)$. Obviously $f(y/x)$ satisfies this criterion. A more common interpretation is that an equation of the form $L(y) = f$ is homogeneous iff $f = 0$.

Figure 2.2: $y(x)$ which solves $x y' = 3y + y^2/x$ with $y(1) = 4$.



Using our developed formula, Eq. (2.15), we get

$$\frac{du}{2u + u^2} = \frac{dx}{x}. \tag{2.20}$$

Since by partial fraction expansion we have

$$\frac{1}{2u + u^2} = \frac{1}{2u} - \frac{1}{4 + 2u}, \tag{2.21}$$

Eq. (2.20) can be rewritten as

$$\frac{du}{2u} - \frac{du}{4 + 2u} = \frac{dx}{x}. \tag{2.22}$$

Both sides can be integrated to give

$$\frac{1}{2} \left( \ln |u| - \ln |2 + u| \right) = \ln |x| + C. \tag{2.23}$$

The initial condition gives $C = (1/2) \ln (2/3)$, so that the solution can be reduced to

$$\left| \frac{y}{2x + y} \right| = \frac{2}{3} x^2.$$

This can be solved explicitly for $y(x)$ for each case of the absolute value. The first case

$$y(x) = \frac{\frac{4}{3} x^3}{1 - \frac{2}{3} x^2}, \tag{2.24}$$

is seen to satisfy the condition at $x = 1$. The second case is discarded as it does not satisfy the condition at $x = 1$. The solution is plotted in Fig. 2.2.
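As a check on Eq. (2.24), one can substitute it back into Eq. (2.17) numerically. The sketch below uses a central difference for $y'$; the function names are hypothetical.

```python
def y(x):
    """Eq. (2.24): y = (4/3) x^3 / (1 - (2/3) x^2)."""
    return (4 / 3) * x**3 / (1 - (2 / 3) * x**2)

def residual(x, h=1e-6):
    """x*y' - 3*y - y^2/x, with y' from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return x * dydx - 3 * y(x) - y(x) ** 2 / x

print(y(1.0), residual(1.0), residual(0.5))
```

The check is meaningful only away from $x = 0$ and from $x = \sqrt{3/2}$, where the denominator of Eq. (2.24) vanishes.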

2.3 Exact equations

A differential equation is exact if it can be written in the form

$$dF(x, y) = 0, \tag{2.25}$$

where $F(x, y) = 0$ is a solution to the differential equation. The chain rule is used to expand the derivative of $F(x,y)$ as

$$dF = \frac{\partial F}{\partial x} dx + \frac{\partial F}{\partial y} dy = 0. \tag{2.26}$$

So, for an equation of the form

$$P(x, y)\, dx + Q(x, y)\, dy = 0, \tag{2.27}$$

we have an exact differential if

$$\frac{\partial F}{\partial x} = P(x, y), \qquad \frac{\partial F}{\partial y} = Q(x, y), \tag{2.28}$$

$$\frac{\partial^2 F}{\partial x \partial y} = \frac{\partial P}{\partial y}, \qquad \frac{\partial^2 F}{\partial y \partial x} = \frac{\partial Q}{\partial x}. \tag{2.29}$$

As long as $F(x, y)$ is continuous and differentiable, the mixed second partials are equal, thus,

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{2.30}$$

must hold if $F(x, y)$ is to exist and render the original differential equation to be exact.


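Example 2.3

Solve

$$\frac{dy}{dx} = \frac{e^{x - y}}{e^{x - y} - 1}, \tag{2.31}$$

$$\underbrace{\left( e^{x - y} \right)}_{=P} dx + \underbrace{\left( 1 - e^{x - y} \right)}_{=Q} dy = 0, \tag{2.32}$$

$$\frac{\partial P}{\partial y} = -e^{x - y}, \tag{2.33}$$

$$\frac{\partial Q}{\partial x} = -e^{x - y}. \tag{2.34}$$

Since $\partial P / \partial y = \partial Q / \partial x$, the equation is exact. Thus,

$$\frac{\partial F}{\partial x} = P(x, y), \tag{2.35}$$

$$\frac{\partial F}{\partial x} = e^{x - y}, \tag{2.36}$$

$$F(x, y) = e^{x - y} + A(y), \tag{2.37}$$

$$\frac{\partial F}{\partial y} = -e^{x - y} + \frac{dA}{dy} = Q(x, y) = 1 - e^{x - y}, \tag{2.38}$$

$$\frac{dA}{dy} = 1, \tag{2.39}$$

$$A(y) = y - C, \tag{2.40}$$

$$F(x, y) = e^{x - y} + y - C = 0, \tag{2.41}$$

$$e^{x - y} + y = C. \tag{2.42}$$

The solution for various values of $C$ is plotted in Fig. 2.3.

Figure 2.3: $y(x)$ which solves $y' = \exp(x - y)/(\exp(x - y) - 1)$, for $C = 0, 1, 2, 3$.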
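The implicit solution $e^{x-y} + y = C$ can be checked by comparing the slope of its level curves, $dy/dx = -(\partial F/\partial x)/(\partial F/\partial y)$, with the right side of Eq. (2.31). The following is a small sketch using central differences; the function names and the sample point are hypothetical choices.

```python
import math

def F(x, y, C=0.0):
    """Implicit solution F(x, y) = exp(x - y) + y - C."""
    return math.exp(x - y) + y - C

def slope_from_F(x, y, h=1e-6):
    """dy/dx = -F_x / F_y along a level curve of F, by central differences."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return -Fx / Fy

def slope_from_ode(x, y):
    """Right side of Eq. (2.31)."""
    return math.exp(x - y) / (math.exp(x - y) - 1)

print(slope_from_F(1.0, 0.3), slope_from_ode(1.0, 0.3))
```

The comparison breaks down on the line $x = y$, where $\partial F / \partial y = 0$ and the slope is unbounded.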



2.4 Integrating factors 



Sometimes, an equation of the form of Eq. (2.27) is not exact, but can be made so by multiplication by a function $u(x, y)$, where $u$ is called the integrating factor. It is not always obvious that integrating factors exist; sometimes they do not. When one exists, it may not be unique.

Example 2.4

Solve

$$\frac{dy}{dx} = \frac{2xy}{x^2 - y^2}. \tag{2.43}$$

Rearranging, we get

$$(x^2 - y^2)\, dy = 2xy\, dx. \tag{2.44}$$

This is not exact according to criterion (2.30). It turns out that the integrating factor is $y^{-2}$, so that on multiplication, we get

$$\frac{2x}{y}\, dx - \left( \frac{x^2}{y^2} - 1 \right) dy = 0. \tag{2.45}$$

This can be written as

$$d \left( \frac{x^2}{y} + y \right) = 0, \tag{2.46}$$

which gives

$$\frac{x^2}{y} + y = C, \tag{2.47}$$

$$x^2 + y^2 = Cy. \tag{2.48}$$

The solution for various values of $C$ is plotted in Fig. 2.4.

Figure 2.4: $y(x)$ which solves $y'(x) = 2xy/(x^2 - y^2)$, for $C = 1, 2, 3$.

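The general first-order linear equation

$$\frac{dy(x)}{dx} + P(x)\, y(x) = Q(x), \tag{2.49}$$

with

$$y(x_0) = y_0, \tag{2.50}$$

can be solved using the integrating factor

$$e^{\int_a^x P(s)\, ds} = e^{F(x) - F(a)}. \tag{2.51}$$

We choose $a$ such that

$$F(a) = 0. \tag{2.52}$$

Multiply by the integrating factor and proceed:

$$\left( e^{\int_a^x P(s)\, ds} \right) \frac{dy(x)}{dx} + \left( e^{\int_a^x P(s)\, ds} \right) P(x)\, y(x) = \left( e^{\int_a^x P(s)\, ds} \right) Q(x), \tag{2.53}$$

product rule:

$$\frac{d}{dx} \left( e^{\int_a^x P(s)\, ds}\, y(x) \right) = \left( e^{\int_a^x P(s)\, ds} \right) Q(x), \tag{2.54}$$

replace $x$ by $t$:

$$\frac{d}{dt} \left( e^{\int_a^t P(s)\, ds}\, y(t) \right) = \left( e^{\int_a^t P(s)\, ds} \right) Q(t), \tag{2.55}$$

integrate:

$$\int_{x_0}^{x} \frac{d}{dt} \left( e^{\int_a^t P(s)\, ds}\, y(t) \right) dt = \int_{x_0}^{x} \left( e^{\int_a^t P(s)\, ds} \right) Q(t)\, dt, \tag{2.56}$$

$$e^{\int_a^x P(s)\, ds}\, y(x) - e^{\int_a^{x_0} P(s)\, ds}\, y(x_0) = \int_{x_0}^{x} \left( e^{\int_a^t P(s)\, ds} \right) Q(t)\, dt, \tag{2.57}$$

which yields

$$y(x) = e^{-\int_a^x P(s)\, ds} \left( e^{\int_a^{x_0} P(s)\, ds}\, y_0 + \int_{x_0}^{x} \left( e^{\int_a^t P(s)\, ds} \right) Q(t)\, dt \right). \tag{2.58}$$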



I 

Example 2.5

Solve

$$y' - y = e^{2x}; \qquad y(0) = y_0. \tag{2.59}$$

Here

$$P(x) = -1, \tag{2.60}$$

or

$$P(s) = -1, \tag{2.61}$$

$$\int_a^x P(s)\, ds = \int_a^x (-1)\, ds, \tag{2.62}$$

$$= -s \Big|_a^x, \tag{2.63}$$

$$= a - x. \tag{2.64}$$

So

$$F(t) = -t. \tag{2.65}$$

For $F(a) = 0$, take $a = 0$. So the integrating factor is

$$e^{\int_a^x P(s)\, ds} = e^{a - x} = e^{0 - x} = e^{-x}. \tag{2.66}$$

Multiplying and rearranging, we get

$$e^{-x} \frac{dy(x)}{dx} - e^{-x} y(x) = e^{x}, \tag{2.67}$$

$$\frac{d}{dx} \left( e^{-x} y(x) \right) = e^{x}, \tag{2.68}$$

$$\frac{d}{dt} \left( e^{-t} y(t) \right) = e^{t}, \tag{2.69}$$

$$\int_{x_0 = 0}^{x} \frac{d}{dt} \left( e^{-t} y(t) \right) dt = \int_{x_0 = 0}^{x} e^{t}\, dt, \tag{2.70}$$

$$e^{-x} y(x) - e^{-0} y(0) = e^{x} - e^{0}, \tag{2.71}$$

$$e^{-x} y(x) - y_0 = e^{x} - 1, \tag{2.72}$$

$$y(x) = e^{x} \left( y_0 + e^{x} - 1 \right), \tag{2.73}$$

$$y(x) = e^{2x} + (y_0 - 1) e^{x}. \tag{2.74}$$

The solution for various values of $y_0$ is plotted in Fig. 2.5.

Figure 2.5: $y(x)$ which solves $y' - y = e^{2x}$ with $y(0) = y_0$.
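The general formula, Eq. (2.58), can also be evaluated by numerical quadrature and compared with the closed form of Eq. (2.74). The sketch below assumes a composite Simpson rule for both integrals; the helper names `simpson` and `linear_ode_solution` are hypothetical and not part of any library.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def linear_ode_solution(P, Q, x0, y0, x, a=0.0):
    """Evaluate Eq. (2.58) for y' + P(x) y = Q(x), y(x0) = y0."""
    mu = lambda t: math.exp(simpson(P, a, t))      # integrating factor
    integral = simpson(lambda t: mu(t) * Q(t), x0, x)
    return (mu(x0) * y0 + integral) / mu(x)

y0 = 2.0
numeric = linear_ode_solution(lambda s: -1.0, lambda t: math.exp(2 * t), 0.0, y0, 1.0)
exact = math.exp(2.0) + (y0 - 1.0) * math.exp(1.0)   # Eq. (2.74) at x = 1
print(numeric, exact)
```

The nested quadrature is wasteful but keeps the code close to the formula; for smooth $P$ and $Q$ the two values should agree to many digits.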



2.5 Bernoulli equation 

Some first-order non-linear equations also have analytical solutions. An example is the Bernoulli² equation

$$y' + P(x)\, y = Q(x)\, y^n, \tag{2.75}$$

where $n \ne 1$. Let

$$u = y^{1 - n}, \tag{2.76}$$

so that

$$y = u^{\frac{1}{1 - n}}. \tag{2.77}$$

The derivative is

$$y' = \frac{1}{1 - n} \left( u^{\frac{n}{1 - n}} \right) u'. \tag{2.78}$$

Substituting in Eq. (2.75), we get

$$\frac{1}{1 - n} \left( u^{\frac{n}{1 - n}} \right) u' + P(x)\, u^{\frac{1}{1 - n}} = Q(x)\, u^{\frac{n}{1 - n}}. \tag{2.79}$$

This can be written as

$$u' + (1 - n) P(x)\, u = (1 - n) Q(x), \tag{2.80}$$

which is a first-order linear equation of the form of Eq. (2.49) and can be solved.

²Jacob Bernoulli, 1654-1705, Swiss-born member of a prolific mathematical family.
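To make the substitution concrete, consider the illustrative case $P = 1$, $Q = 1$, $n = 2$, which is an assumption chosen here and not an example from these notes. Equation (2.80) becomes $u' - u = -1$, with solution $u = 1 + Ce^x$, so that $y = 1/(1 + Ce^x)$. The sketch below checks this against $y' + y = y^2$ using a central difference.

```python
import math

def y(x, C=1.0):
    """y = 1/u with u = 1 + C*exp(x), the solution of u' - u = -1."""
    return 1.0 / (1.0 + C * math.exp(x))

def residual(x, C=1.0, h=1e-6):
    """y' + y - y^2 for the Bernoulli equation with P = Q = 1, n = 2."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx + y(x, C) - y(x, C) ** 2

print(residual(0.0), residual(1.0, C=0.5))
```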

2.6 Riccati equation 

A Riccati³ equation is of the form

$$\frac{dy}{dx} = P(x)\, y^2 + Q(x)\, y + R(x). \tag{2.81}$$

Studied by several Bernoullis and two Riccatis, it was solved by Euler. If we know a specific solution $y = S(x)$ of this equation, the general solution can then be found. Let

$$y = S(x) + \frac{1}{z(x)}, \tag{2.82}$$

thus

$$\frac{dy}{dx} = \frac{dS}{dx} - \frac{1}{z^2} \frac{dz}{dx}. \tag{2.83}$$

Substituting into Eq. (2.81), we get

$$\frac{dS}{dx} - \frac{1}{z^2} \frac{dz}{dx} = P \left( S + \frac{1}{z} \right)^2 + Q \left( S + \frac{1}{z} \right) + R, \tag{2.84}$$

$$\frac{dS}{dx} - \frac{1}{z^2} \frac{dz}{dx} = P \left( S^2 + \frac{2S}{z} + \frac{1}{z^2} \right) + Q \left( S + \frac{1}{z} \right) + R, \tag{2.85}$$

$$\underbrace{\frac{dS}{dx} - \left( P S^2 + Q S + R \right)}_{=0} - \frac{1}{z^2} \frac{dz}{dx} = P \left( \frac{2S}{z} + \frac{1}{z^2} \right) + Q \left( \frac{1}{z} \right), \tag{2.86}$$

$$-\frac{dz}{dx} = P (2Sz + 1) + Qz, \tag{2.87}$$

$$\frac{dz}{dx} + \left( 2 P(x) S(x) + Q(x) \right) z = -P(x). \tag{2.88}$$

Again this is a first order linear equation in $z$ and $x$ of the form of Eq. (2.49) and can be solved.

³Jacopo Riccati, 1676-1754, Venetian mathematician.



I 

Example 2.6

Solve

$$\frac{dy}{dx} = \frac{e^{-3x}}{x} y^2 - \frac{1}{x} y + 3 e^{3x}. \tag{2.89}$$

One solution is

$$y = S(x) = e^{3x}. \tag{2.90}$$

Verify:

$$3 e^{3x} = \frac{e^{-3x}}{x} e^{6x} - \frac{1}{x} e^{3x} + 3 e^{3x}, \tag{2.91}$$

$$3 e^{3x} = \frac{e^{3x}}{x} - \frac{e^{3x}}{x} + 3 e^{3x}, \tag{2.92}$$

$$3 e^{3x} = 3 e^{3x}, \tag{2.93}$$

so let

$$y = e^{3x} + \frac{1}{z}. \tag{2.94}$$

Also we have

$$P(x) = \frac{e^{-3x}}{x}, \tag{2.95}$$

$$Q(x) = -\frac{1}{x}, \tag{2.96}$$

$$R(x) = 3 e^{3x}. \tag{2.97}$$

Substituting into Eq. (2.88), we get

$$\frac{dz}{dx} + \left( 2 \frac{e^{-3x}}{x} e^{3x} - \frac{1}{x} \right) z = -\frac{e^{-3x}}{x}, \tag{2.98}$$

$$\frac{dz}{dx} + \frac{z}{x} = -\frac{e^{-3x}}{x}. \tag{2.99}$$

The integrating factor here is

$$e^{\int \frac{dx}{x}} = e^{\ln x} = x. \tag{2.100}$$

Multiplying by the integrating factor $x$

$$x \frac{dz}{dx} + z = -e^{-3x}, \tag{2.101}$$

$$\frac{d(xz)}{dx} = -e^{-3x}, \tag{2.102}$$

which can be integrated as

$$z = \frac{e^{-3x}}{3x} + \frac{C}{x} = \frac{e^{-3x} + 3C}{3x}. \tag{2.103}$$

Since $y = S(x) + 1/z$, the solution is thus

$$y = e^{3x} + \frac{3x}{e^{-3x} + 3C}. \tag{2.104}$$

The solution for various values of $C$ is plotted in Fig. 2.6.

Figure 2.6: $y(x)$ which solves $y' = \exp(-3x)\, y^2/x - y/x + 3\exp(3x)$, including the curves for $C = -2$ and $C = -1$.
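The Riccati solution of Example 2.6, $y = e^{3x} + 3x/(e^{-3x} + 3C)$, can be checked by substitution into $y' = (e^{-3x}/x)\, y^2 - y/x + 3e^{3x}$. The sketch below is a numerical check with a central difference; the function names are hypothetical.

```python
import math

def y(x, C=1.0):
    """General solution y = exp(3x) + 3x / (exp(-3x) + 3C)."""
    return math.exp(3 * x) + 3 * x / (math.exp(-3 * x) + 3 * C)

def rhs(x, yv):
    """Right side of the Riccati equation of Example 2.6."""
    return math.exp(-3 * x) / x * yv**2 - yv / x + 3 * math.exp(3 * x)

def residual(x, C=1.0, h=1e-6):
    """y' - rhs(x, y), with y' from a central difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx - rhs(x, y(x, C))

print(residual(0.5), residual(1.0, C=2.0))
```

The check requires $x \ne 0$ and $e^{-3x} + 3C \ne 0$; negative values of $C$ produce a singularity where the denominator vanishes.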



2.7 Reduction of order 

There are higher order equations that can be reduced to first-order equations and then solved. 



2.7.1 y absent 

If

$$f(x, y', y'') = 0, \tag{2.105}$$

then let $u(x) = y'$. Thus, $u'(x) = y''$, and the equation reduces to

$$f\left( x, u, \frac{du}{dx} \right) = 0, \tag{2.106}$$

which is an equation of first order.

Example 2.7

Solve

$$x y'' + 2 y' = 4 x^3. \tag{2.107}$$

Let $u = y'$, so that

$$x \frac{du}{dx} + 2u = 4 x^3. \tag{2.108}$$

Multiplying by $x$

$$x^2 \frac{du}{dx} + 2xu = 4 x^4, \tag{2.109}$$

$$\frac{d}{dx} \left( x^2 u \right) = 4 x^4. \tag{2.110}$$

This can be integrated to give

$$u = \frac{4}{5} x^3 + \frac{C_1}{x^2}, \tag{2.111}$$

from which

$$y = \frac{1}{5} x^4 - \frac{C_1}{x} + C_2, \tag{2.112}$$

for $x \ne 0$.



2.7.2 x absent 

If

$$f(y, y', y'') = 0, \tag{2.113}$$

let $u(x) = y'$, so that

$$y'' = \frac{dy'}{dx} = \frac{dy'}{dy} \frac{dy}{dx} = \frac{du}{dy} u. \tag{2.114}$$

Equation (2.113) becomes

$$f\left( y, u, u \frac{du}{dy} \right) = 0, \tag{2.115}$$

which is also an equation of first order. Note however that the independent variable is now $y$ while the dependent variable is $u$.



I 

Example 2.8

Solve

$$y'' - 2 y y' = 0; \qquad y(0) = y_0, \quad y'(0) = y'_0. \tag{2.116}$$

Let $u = y'$, so that $y'' = du/dx = (dy/dx)(du/dy) = u (du/dy)$. The equation becomes

$$u \frac{du}{dy} - 2yu = 0. \tag{2.117}$$

Now

$$u = 0, \tag{2.118}$$

satisfies Eq. (2.117). Thus,

$$y' = 0, \tag{2.119}$$

$$y = C, \tag{2.120}$$

applying one initial condition:

$$y = y_0. \tag{2.121}$$

This satisfies the initial conditions only under special circumstances, i.e. $y'_0 = 0$. For $u \ne 0$,

$$\frac{du}{dy} = 2y, \tag{2.122}$$

$$u = y^2 + C_1, \tag{2.123}$$

apply I.C.'s:

$$y'_0 = y_0^2 + C_1, \tag{2.124}$$

$$C_1 = y'_0 - y_0^2, \tag{2.125}$$

$$\frac{dy}{dx} = y^2 + y'_0 - y_0^2, \tag{2.126}$$

$$\frac{dy}{y^2 + y'_0 - y_0^2} = dx, \tag{2.127}$$

from which for $y'_0 - y_0^2 > 0$

$$\frac{1}{\sqrt{y'_0 - y_0^2}} \tan^{-1} \left( \frac{y}{\sqrt{y'_0 - y_0^2}} \right) = x + C_2, \tag{2.128}$$

$$\frac{1}{\sqrt{y'_0 - y_0^2}} \tan^{-1} \left( \frac{y_0}{\sqrt{y'_0 - y_0^2}} \right) = C_2, \tag{2.129}$$

$$y(x) = \sqrt{y'_0 - y_0^2}\, \tan \left( x \sqrt{y'_0 - y_0^2} + \tan^{-1} \left( \frac{y_0}{\sqrt{y'_0 - y_0^2}} \right) \right). \tag{2.130}$$

The solution for $y_0 = 0$, $y'_0 = 1$ is plotted in Fig. 2.7.

Figure 2.7: $y(x)$ which solves $y'' - 2yy' = 0$ with $y(0) = 0$, $y'(0) = 1$.

For $y'_0 - y_0^2 = 0$,

$$\frac{dy}{dx} = y^2, \tag{2.131}$$

$$\frac{dy}{y^2} = dx, \tag{2.132}$$

$$-\frac{1}{y} = x + C_2, \tag{2.133}$$

$$-\frac{1}{y_0} = C_2, \tag{2.134}$$

$$-\frac{1}{y} = x - \frac{1}{y_0}, \tag{2.135}$$

$$y = \frac{1}{\frac{1}{y_0} - x}. \tag{2.136}$$

For $y'_0 - y_0^2 < 0$, one would obtain solutions in terms of hyperbolic trigonometric functions; see Sec. 10.5.

2.8 Uniqueness and singular solutions 

Not all differential equations have solutions, as can be seen by considering

$$y' = \frac{y}{x} \ln y, \qquad y(0) = 2. \tag{2.137}$$

The general solution of the differential equation is $y = e^{Cx}$, but no finite value of $C$ allows the initial condition to be satisfied. Let's check this by direct substitution:

$$y = e^{Cx}, \tag{2.138}$$

$$y' = C e^{Cx}, \tag{2.139}$$

$$\frac{y}{x} \ln y = \frac{e^{Cx}}{x} \ln e^{Cx}, \tag{2.140}$$

$$= \frac{e^{Cx}}{x} Cx, \tag{2.141}$$

$$= C e^{Cx}, \tag{2.142}$$

$$= y'. \tag{2.143}$$

So the differential equation is satisfied for all values of $C$. Now to satisfy the initial condition, we must have

$$2 = e^{C(0)}, \tag{2.144}$$

$$2 = 1? \tag{2.145}$$

There is no finite value of $C$ that allows satisfaction of the initial condition. The original differential equation can be written as $x y' = y \ln y$. The point $x = 0$ is singular since at that point, the highest derivative is multiplied by $0$ leaving only $0 = y \ln y$ at $x = 0$. For the very special initial condition $y(0) = 1$, the solution $y = e^{Cx}$ is valid for all values of $C$. Thus, for this singular equation, for most initial conditions, no solution exists. For one special initial condition, a solution exists, but it is not unique.

Theorem

Let $f(x, y)$ be continuous and satisfy $|f(x, y)| \le m$ and the Lipschitz⁴ condition $|f(x, y) - f(x, y_0)| \le k |y - y_0|$ in a bounded region $\mathcal{R}$. Then the equation $y' = f(x, y)$ has one and only one solution containing the point $(x_0, y_0)$.

A stronger condition is that if $f(x, y)$ and $\partial f / \partial y$ are finite and continuous at $(x_0, y_0)$, then a solution of $y' = f(x, y)$ exists and is unique in the neighborhood of this point.



I 

Example 2.9

Analyze the uniqueness of the solution of

$$\frac{dy}{dt} = -K \sqrt{y}, \qquad y(T) = 0. \tag{2.146}$$

Here, $t$ is the independent variable instead of $x$. Taking,

$$f(t, y) = -K \sqrt{y}, \tag{2.147}$$

we have

$$\frac{\partial f}{\partial y} = -\frac{K}{2 \sqrt{y}}, \tag{2.148}$$

which is not finite at $y = 0$. So the solution cannot be guaranteed to be unique. In fact, one solution is

$$y(t) = \frac{1}{4} K^2 (t - T)^2. \tag{2.149}$$

Another solution which satisfies the initial condition and differential equation is

$$y(t) = 0. \tag{2.150}$$

Obviously the solution is not unique.

I 



I 

Example 2.10

Consider the differential equation and initial condition

$$\frac{dy}{dx} = 3 y^{2/3}, \qquad y(2) = 0. \tag{2.151}$$

On separating variables and integrating, we get

$$3 y^{1/3} = 3x + 3C, \tag{2.152}$$

so that the general solution is

$$y = (x + C)^3. \tag{2.153}$$

Applying the initial condition, we find

$$y = (x - 2)^3. \tag{2.154}$$

However,

$$y = 0, \tag{2.155}$$

and

$$y = \begin{cases} (x - 2)^3 & \mbox{if } x \ge 2, \\ 0 & \mbox{if } x < 2, \end{cases} \tag{2.156}$$

are also solutions. These singular solutions cannot be obtained from the general solution. However, values of $y'$ and $y$ are the same at intersections. Both satisfy the differential equation. The two solutions are plotted in Fig. 2.8.

Figure 2.8: Two solutions $y(x)$ which satisfy $y' = 3 y^{2/3}$ with $y(2) = 0$.

⁴Rudolf Otto Sigismund Lipschitz, 1832-1903, German mathematician.

2.9 Clairaut equation 

The solution of a Clairaut⁵ equation

$$y = x y' + f(y'), \tag{2.157}$$

can be obtained by letting $y' = u(x)$, so that

$$y = xu + f(u). \tag{2.158}$$

Differentiating with respect to $x$, we get

$$y' = x u' + u + \frac{df}{du} u', \tag{2.159}$$

$$u = x u' + u + \frac{df}{du} u', \tag{2.160}$$

$$\left( x + \frac{df}{du} \right) u' = 0. \tag{2.161}$$

There are two possible solutions to this, $u' = 0$ or $x + df/du = 0$. If we consider the first and take

$$u' = \frac{du}{dx} = 0, \tag{2.162}$$

we can integrate to get

$$u = C, \tag{2.163}$$

where $C$ is a constant. Then, from Eq. (2.158), we get the general solution

$$y = Cx + f(C). \tag{2.164}$$

Applying an initial condition $y(x_0) = y_0$ gives what we will call the regular solution.

But if we take the second

$$x + \frac{df}{du} = 0, \tag{2.165}$$

and rearrange to get

$$x = -\frac{df}{du}, \tag{2.166}$$

then Eq. (2.166) along with the rearranged Eq. (2.158)

$$y = -u \frac{df}{du} + f(u), \tag{2.167}$$

form a set of parametric equations for what we call the singular solution. It is singular because the coefficient on the highest derivative in Eq. (2.161) is itself 0.

⁵Alexis Claude Clairaut, 1713-1765, Parisian/French mathematician.



I 

Example 2.11

Solve

$$y = x y' + (y')^3, \qquad y(0) = y_0. \tag{2.168}$$

Take

$$u = y'. \tag{2.169}$$

Then

$$f(u) = u^3, \tag{2.170}$$

$$\frac{df}{du} = 3 u^2, \tag{2.171}$$

specializing Eq. (2.164) gives

$$y = Cx + C^3$$

as the general solution. Use the initial condition to evaluate $C$ and get the regular solution:

$$y_0 = C(0) + C^3, \tag{2.172}$$

$$C = y_0^{1/3}, \tag{2.173}$$

$$y = y_0^{1/3} x + y_0. \tag{2.174}$$

Note if $y_0 \in \mathbb{R}^1$, there are actually three roots for $C$: $C = y_0^{1/3}$, $(-1/2 \pm i \sqrt{3}/2)\, y_0^{1/3}$. So the solution is non-unique. However, if we confine our attention to real valued solutions, there is a unique real solution, with $C = y_0^{1/3}$.

The parametric form of the singular solution is

$$y = -2 u^3, \tag{2.175}$$

$$x = -3 u^2. \tag{2.176}$$

Eliminating the parameter $u$, we obtain

$$y = \pm 2 \left( -\frac{x}{3} \right)^{3/2}, \tag{2.177}$$

as the explicit form of the singular solution.

The regular solutions and singular solution are plotted in Fig. 2.9.

Figure 2.9: Two solutions $y(x)$ which satisfy $y = x y' + (y')^3$ with $y(0) = y_0$.

Note

• In contrast to solutions for equations linear in $y'$, the trajectories $y(x; y_0)$ cross at numerous locations in the $x$-$y$ plane. This is a consequence of the differential equation's non-linearity.

• While the singular solution satisfies the differential equation, it satisfies this initial condition only when $y_0 = 0$.

• For real valued $x$ and $y$, the singular solution is only valid for $x \le 0$.

• Because of non-linearity, addition of the regular and singular solutions does not yield a solution to the differential equation.
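Both families of solutions to $y = xy' + (y')^3$ can be checked by substitution. In the sketch below the regular solution is $y = Cx + C^3$ with slope $C$, and the singular solution is taken in the parametric form $x = -3u^2$, $y = -2u^3$, whose slope is $dy/dx = (dy/du)/(dx/du) = u$. The function names are hypothetical.

```python
def regular(x, C):
    """General solution y = C*x + C**3 of the Clairaut equation with f(u) = u**3."""
    return C * x + C**3

def singular(u):
    """Parametric singular solution: returns (x, y) = (-3u^2, -2u^3)."""
    return -3.0 * u**2, -2.0 * u**3

def clairaut_residual(x, y, yp):
    """y - (x*y' + (y')**3)."""
    return y - (x * yp + yp**3)

for C in (-1.0, 0.5, 2.0):
    print("regular", C, clairaut_residual(1.3, regular(1.3, C), C))
for u in (-1.5, -0.5, 0.7):
    xs, ys = singular(u)
    print("singular", u, clairaut_residual(xs, ys, u))
```

The last bullet above can be probed the same way: the sum of a regular and the singular solution generally leaves a non-zero residual.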



Problems 

1. Find the general solution of the differential equation

$$y' + x^2 y (1 + y) = 1 + x^3 (1 + x).$$

Plot solutions for $y(0) = -2, 0, 2$.

2. Solve

$$\dot{x} = 2tx + t e^{-t^2} x^2.$$

Plot a solution for $x(0) = 1$.

3. Solve

$$3 x^2 y^2\, dx + 2 x^3 y\, dy = 0.$$

4. Solve

$$\frac{dy}{dx} = \frac{x - y}{x + y}.$$

5. Solve the non-linear equation $(y' - x) y'' + 2 y' = 2x$.

6. Solve $x y'' + 2 y' = x$. Plot a solution for $y(1) = 1$, $y'(1) = 1$.

7. Solve $y'' - 2 y y' = 0$. Plot a solution for $y(0) = 0$, $y'(0) = 3$.

8. Given that $y_1 = x^{-1}$ is one solution of $y'' + (3/x) y' + (1/x^2) y = 0$, find the other solution.

9. Solve 

(a) $y' \tan y + 2 \sin x \sin (\frac{\pi}{2} + x) + \ln x = 0$

(b) $x y' - 2y - x^4 - y^2 = 0$

(c) $y' \cos y \cos x + \sin y \sin x = 0$

(d) $y' + y \cot x = e^x$

(e) x 5 y' + y + e x * (x 6 - l)y 3 = 0, with y(l) = e" 1 / 2 

(f) $y' + y^2 - xy - 1 = 0$

(g) $y' (x + y^2) - y = 0$

^^ » -2x-j/+4 

(i) $y' + xy = y$

Plot solutions, when possible, for $y(0) = -1, 0, 1$.

10. Find all solutions of

$$(x + 1)(y')^2 + (x - y) y' - y = 0.$$

11. Find an $a$ for which a unique real solution of

$$(y')^4 + 8 (y')^3 + (3a + 16)(y')^2 + 12 a y' + 2 a^2 = 0, \qquad \mbox{with } y(1) = -2,$$

exists. Find the solution.

12. Solve

$$y' - \frac{1}{x^2} y^2 + \frac{1}{x} y = 1.$$

13. Find the most general solution to

$$(y' - 1)(y' + 1) = 0.$$

14. Solve

$$(D - 1)(D - 2) y = x.$$

Chapter 3

Linear ordinary differential equations 



see Kaplan, 9.1-9.4, 

see Lopez, Chapter 5, 

see Bender and Orszag, 1.1-1.5, 

see Riley, Hobson, and Bence, Chapter 13, Chapter 15.6, 

see Friedman, Chapter 3. 

We consider in this chapter linear ordinary differential equations. We will mainly be concerned with equations which are of second order or higher in a single dependent variable.

3.1 Linearity and linear independence 

An ordinary differential equation can be written in the form

$$L(y) = f(x), \tag{3.1}$$

where $y(x)$ is an unknown function. The equation is said to be homogeneous if $f(x) = 0$, giving then

$$L(y) = 0. \tag{3.2}$$

This is the most common usage for the term "homogeneous." The operator $L$ is composed of a combination of derivatives $d/dx$, $d^2/dx^2$, etc. The operator $L$ is linear if

$$L(y_1 + y_2) = L(y_1) + L(y_2), \tag{3.3}$$

and

$$L(\alpha y) = \alpha L(y), \tag{3.4}$$

where $\alpha$ is a scalar. We can contrast this definition of linearity with the definition of the more general term "affine" given by Eq. (1.102), which, while similar, admits a constant inhomogeneity.

For the remainder of this chapter, we will take $L$ to be a linear differential operator. The general form of $L$ is

$$L = P_N(x) \frac{d^N}{dx^N} + P_{N-1}(x) \frac{d^{N-1}}{dx^{N-1}} + \ldots + P_1(x) \frac{d}{dx} + P_0(x). \tag{3.5}$$

The ordinary differential equation, Eq. (3.1), is then linear when $L$ has the form of Eq. (3.5).

Definition: The functions $y_1(x), y_2(x), \ldots, y_N(x)$ are said to be linearly independent when $C_1 y_1(x) + C_2 y_2(x) + \ldots + C_N y_N(x) = 0$ is true only when $C_1 = C_2 = \ldots = C_N = 0$.

A homogeneous equation of order $N$ can be shown to have $N$ linearly independent solutions. These are called complementary functions. If $y_n$ $(n = 1, \ldots, N)$ are the complementary functions of Eq. (3.2), then

$$y(x) = \sum_{n=1}^{N} C_n y_n(x), \tag{3.6}$$

is the general solution of the homogeneous Eq. (3.2). In language to be defined in a future chapter, Sec. 7.3, we can say the complementary functions are linearly independent and span the space of solutions of the homogeneous equation; they are the bases of the null space of the differential operator $L$. If $y_p(x)$ is any particular solution of Eq. (3.1), the general solution to Eq. (3.1) is then

$$y(x) = y_p(x) + \sum_{n=1}^{N} C_n y_n(x). \tag{3.7}$$

Now we would like to show that any solution $\phi(x)$ to the homogeneous equation $L(y) = 0$ can be written as a linear combination of the $N$ complementary functions $y_n(x)$:

$$C_1 y_1(x) + C_2 y_2(x) + \ldots + C_N y_N(x) = \phi(x). \tag{3.8}$$

We can form additional equations by taking a series of derivatives up to $N - 1$:

$$C_1 y_1'(x) + C_2 y_2'(x) + \ldots + C_N y_N'(x) = \phi'(x), \tag{3.9}$$

$$\vdots$$

$$C_1 y_1^{(N-1)}(x) + C_2 y_2^{(N-1)}(x) + \ldots + C_N y_N^{(N-1)}(x) = \phi^{(N-1)}(x). \tag{3.10}$$

This is a linear system of algebraic equations:

$$\begin{pmatrix} y_1 & y_2 & \cdots & y_N \\ y_1' & y_2' & \cdots & y_N' \\ \vdots & \vdots & \ddots & \vdots \\ y_1^{(N-1)} & y_2^{(N-1)} & \cdots & y_N^{(N-1)} \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \\ \vdots \\ C_N \end{pmatrix} = \begin{pmatrix} \phi(x) \\ \phi'(x) \\ \vdots \\ \phi^{(N-1)}(x) \end{pmatrix}. \tag{3.11}$$

We could solve Eq. (3.11) by Cramer's rule, which requires the use of determinants. For a unique solution, we need the determinant of the coefficient matrix of Eq. (3.11) to be non-zero. This particular determinant is known as the Wronskian¹ $W$ of $y_1(x), y_2(x), \ldots, y_N(x)$ and is defined as

W = \begin{vmatrix}
y_1 & y_2 & \ldots & y_N \\
y_1' & y_2' & \ldots & y_N' \\
\vdots & \vdots & & \vdots \\
y_1^{(N-1)} & y_2^{(N-1)} & \ldots & y_N^{(N-1)}
\end{vmatrix}.    (3.12)



The condition $W \ne 0$ indicates linear independence of the functions $y_1(x), y_2(x), \ldots, y_N(x)$, since if $\phi(x) = 0$, the only solution is $C_n = 0$, $n = 1, \ldots, N$. Unfortunately, the converse is not always true; that is, if $W = 0$, the complementary functions may or may not be linearly dependent, though in most cases $W = 0$ indeed implies linear dependence.



Example 3.1

Determine the linear independence of (a) $y_1 = x$ and $y_2 = 2x$, (b) $y_1 = x$ and $y_2 = x^2$, and (c) $y_1 = x^2$ and $y_2 = x|x|$ for $x \in (-1, 1)$.



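(a) W = \begin{vmatrix} x & 2x \\ 1 & 2 \end{vmatrix} = 0, linearly dependent.

(b) W = \begin{vmatrix} x & x^2 \\ 1 & 2x \end{vmatrix} = x^2 \ne 0, linearly independent, except at $x = 0$.

(c) We can restate $y_2$ as

y_2(x) = -x^2,  x \in (-1, 0],    (3.13)
y_2(x) = x^2,   x \in (0, 1),     (3.14)

so that

W = \begin{vmatrix} x^2 & -x^2 \\ 2x & -2x \end{vmatrix} = -2x^3 + 2x^3 = 0,  x \in (-1, 0],    (3.15)
W = \begin{vmatrix} x^2 & x^2 \\ 2x & 2x \end{vmatrix} = 2x^3 - 2x^3 = 0,    x \in (0, 1).     (3.16)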



Thus, $W = 0$ for $x \in (-1, 1)$, which suggests the functions may be linearly dependent. However, when we seek $C_1$ and $C_2$ such that $C_1 y_1 + C_2 y_2 = 0$, we find the only solution is $C_1 = 0$, $C_2 = 0$; therefore, the functions are in fact linearly independent, despite the fact that $W = 0$! Let's check this. For $x \in (-1, 0]$,

C_1 x^2 + C_2 (-x^2) = 0,    (3.17)

so we will need $C_1 = C_2$ at a minimum. For $x \in (0, 1)$,

C_1 x^2 + C_2 x^2 = 0,    (3.18)

which gives the requirement that $C_1 = -C_2$. Substituting the first condition into the second gives $C_2 = -C_2$, which is only satisfied if $C_2 = 0$, thus requiring that $C_1 = 0$; hence, the functions are indeed linearly independent.
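The conclusion of part (c) can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the sample points are illustrative assumptions. It evaluates the Wronskian of $x^2$ and $x|x|$ at a few points of $(-1, 1)$, and then imposes $C_1 y_1 + C_2 y_2 = 0$ at one point on each side of the origin.

```python
# Example 3.1(c): y1 = x^2 and y2 = x|x| on (-1, 1).
# All names below are illustrative; they do not come from the notes.

def y1(x):
    return x * x

def y2(x):
    return x * abs(x)

def dy1(x):
    return 2.0 * x

def dy2(x):
    return 2.0 * abs(x)  # d/dx (x|x|) = 2|x|

def wronskian(x):
    # 2x2 Wronskian: y1*y2' - y2*y1'
    return y1(x) * dy2(x) - y2(x) * dy1(x)

samples = [-0.9, -0.5, -0.1, 0.0, 0.1, 0.5, 0.9]
w_values = [wronskian(x) for x in samples]

# Require C1*y1 + C2*y2 = 0 at one point left and one point right of x = 0.
# A non-zero determinant of this 2x2 system forces C1 = C2 = 0.
xa, xb = -0.5, 0.5
det = y1(xa) * y2(xb) - y2(xa) * y1(xb)

print("Wronskian samples:", w_values)
print("determinant of two-point system:", det)
```

The Wronskian vanishes at every sample point, while the two-point system has a non-zero determinant, so only the trivial combination satisfies $C_1 y_1 + C_2 y_2 = 0$ on the whole interval.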



1 Józef Maria Hoene-Wroński, 1778-1853, Polish-born French mathematician.



\CC BY-NC-TW} 29 July 2012, Sen & Powers. 




Example 3.2 

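Determine the linear independence of the set of polynomials,

y_n(x) = \left\{ 1, x, \frac{x^2}{2}, \frac{x^3}{6}, \ldots, \frac{x^{N-1}}{(N-1)!} \right\}.    (3.19)

The Wronskian is

W = \begin{vmatrix}
1 & x & \frac{1}{2}x^2 & \frac{1}{6}x^3 & \ldots & \frac{1}{(N-1)!}x^{N-1} \\
0 & 1 & x & \frac{1}{2}x^2 & \ldots & \frac{1}{(N-2)!}x^{N-2} \\
0 & 0 & 1 & x & \ldots & \frac{1}{(N-3)!}x^{N-3} \\
0 & 0 & 0 & 1 & \ldots & \frac{1}{(N-4)!}x^{N-4} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & 0 & \ldots & 1
\end{vmatrix} = 1.    (3.20)

The determinant is unity, $\forall N$. As such, the polynomials are linearly independent.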



3.2 Complementary functions 

This section will consider solutions to the homogeneous part of the differential equation. 

3.2.1 Equations with constant coefficients 

First consider equations with constant coefficients. 

3.2.1.1 Arbitrary order 

Consider the homogeneous equation with constant coefficients 

A_N y^{(N)} + A_{N-1} y^{(N-1)} + \ldots + A_1 y' + A_0 y = 0,    (3.21)

where $A_n$, $(n = 0, \ldots, N)$ are constants. To find the solution of Eq. (3.21), we let $y = e^{rx}$. Substituting, we get

A_N r^N e^{rx} + A_{N-1} r^{N-1} e^{rx} + \ldots + A_1 r e^{rx} + A_0 e^{rx} = 0.    (3.22)

Eliminating the non-zero common factor $e^{rx}$, we get

A_N r^N + A_{N-1} r^{N-1} + \ldots + A_1 r + A_0 = 0,    (3.23)
\sum_{n=0}^{N} A_n r^n = 0.    (3.24)

This is called the characteristic equation. It is an $N$th order polynomial which has $N$ roots (some of which could be repeated, some of which could be complex), $r_n$ $(n = 1, \ldots, N)$, from which $N$ linearly independent complementary functions $y_n(x)$ $(n = 1, \ldots, N)$ have to be obtained. The general solution is then given by Eq. (3.6).

If all roots are real and distinct, then the complementary functions are simply $e^{r_n x}$, $(n = 1, \ldots, N)$. If, however, $k$ of these roots are repeated, i.e. $r_1 = r_2 = \ldots = r_k = r$, then the linearly independent complementary functions are obtained by multiplying $e^{rx}$ by $1, x, x^2, \ldots, x^{k-1}$. For a pair of complex conjugate roots $p \pm qi$, one can use de Moivre's formula (see Appendix, Eq. (10.91)) to show that the complementary functions are $e^{px} \cos qx$ and $e^{px} \sin qx$.



Example 3.3

Solve

\frac{d^4 y}{dx^4} - 2\frac{d^3 y}{dx^3} + \frac{d^2 y}{dx^2} + 2\frac{dy}{dx} - 2y = 0.    (3.25)

Substituting $y = e^{rx}$, we get a characteristic equation

r^4 - 2r^3 + r^2 + 2r - 2 = 0,    (3.26)

which can be factored as

(r + 1)(r - 1)(r^2 - 2r + 2) = 0,    (3.27)

from which

r_1 = -1,  r_2 = 1,  r_3 = 1 + i,  r_4 = 1 - i.    (3.28)

The general solution is

y(x) = C_1 e^{-x} + C_2 e^{x} + C_3' e^{(1+i)x} + C_4' e^{(1-i)x},    (3.29)
     = C_1 e^{-x} + C_2 e^{x} + C_3' e^{x} e^{ix} + C_4' e^{x} e^{-ix},    (3.30)
     = C_1 e^{-x} + C_2 e^{x} + e^{x} \left( C_3' e^{ix} + C_4' e^{-ix} \right),    (3.31)
     = C_1 e^{-x} + C_2 e^{x} + e^{x} \left( C_3' (\cos x + i \sin x) + C_4' (\cos(-x) + i \sin(-x)) \right),    (3.32)
     = C_1 e^{-x} + C_2 e^{x} + e^{x} \left( (C_3' + C_4') \cos x + i (C_3' - C_4') \sin x \right),    (3.33)
y(x) = C_1 e^{-x} + C_2 e^{x} + e^{x} \left( C_3 \cos x + C_4 \sin x \right),    (3.34)

where $C_3 = C_3' + C_4'$ and $C_4 = i(C_3' - C_4')$.
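As a quick sanity check on the roots in Eq. (3.28), one can evaluate the characteristic polynomial of Eq. (3.26) at each of them. This is a minimal sketch using only built-in complex arithmetic; the helper name `char_poly` is an illustrative assumption.

```python
# Characteristic polynomial of Example 3.3: r^4 - 2 r^3 + r^2 + 2 r - 2.

def char_poly(r):
    # Horner's rule, coefficients from highest to lowest power.
    coeffs = [1, -2, 1, 2, -2]
    value = 0
    for c in coeffs:
        value = value * r + c
    return value

roots = [-1, 1, 1 + 1j, 1 - 1j]
residuals = [abs(char_poly(r)) for r in roots]
print(residuals)
```

Each residual should vanish, confirming the factorization in Eq. (3.27).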



3.2.1.2 First order 

The characteristic polynomial of the first order equation

a y' + b y = 0,    (3.35)

is

a r + b = 0.    (3.36)

So

r = -\frac{b}{a},    (3.37)

thus, the complementary function for Eq. (3.35) is simply

y = C e^{-\frac{b}{a} x}.    (3.38)

3.2.1.3 Second order 

The characteristic polynomial of the second order equation

a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + c y = 0,    (3.39)

is

a r^2 + b r + c = 0.    (3.40)

Depending on the coefficients of this quadratic equation, there are three cases to be considered.

• $b^2 - 4ac > 0$: two distinct real roots $r_1$ and $r_2$. The complementary functions are $y_1 = e^{r_1 x}$ and $y_2 = e^{r_2 x}$,

• $b^2 - 4ac = 0$: one real root. The complementary functions are $y_1 = e^{rx}$ and $y_2 = x e^{rx}$, or

• $b^2 - 4ac < 0$: two complex conjugate roots $p \pm qi$. The complementary functions are $y_1 = e^{px} \cos qx$ and $y_2 = e^{px} \sin qx$.
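The three cases above can be sketched as a small classifier. This is an illustration only, under the assumption that $a \ne 0$ and that the coefficients are real; the function name and the returned labels are hypothetical and not taken from the notes.

```python
import math

def classify_second_order(a, b, c):
    """Classify a*y'' + b*y' + c*y = 0 (a != 0) by its discriminant.

    Returns (label, params):
      "distinct": params = (r1, r2), giving exp(r1 x), exp(r2 x)
      "repeated": params = (r, r),   giving exp(r x), x exp(r x)
      "complex":  params = (p, q),   giving exp(p x) cos(q x), exp(p x) sin(q x)
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        return "distinct", ((-b + root) / (2 * a), (-b - root) / (2 * a))
    if disc == 0:
        r = -b / (2 * a)
        return "repeated", (r, r)
    p = -b / (2 * a)
    q = abs(math.sqrt(-disc) / (2 * a))
    return "complex", (p, q)

print(classify_second_order(1, -3, 2))
print(classify_second_order(1, -2, 1))
print(classify_second_order(1, -2, 10))
```

The three calls correspond to the three constant-coefficient examples that follow in this section. Testing `disc == 0` exactly is only reasonable for exactly representable coefficients; a tolerance would be needed for general floating-point input.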



Example 3.4

Solve

\frac{d^2 y}{dx^2} - 3\frac{dy}{dx} + 2y = 0.    (3.41)

The characteristic equation is

r^2 - 3r + 2 = 0,    (3.42)

with solutions

r_1 = 1,  r_2 = 2.    (3.43)

The general solution is then

y = C_1 e^{x} + C_2 e^{2x}.    (3.44)



Example 3.5

Solve

\frac{d^2 y}{dx^2} - 2\frac{dy}{dx} + y = 0.    (3.45)

The characteristic equation is

r^2 - 2r + 1 = 0,    (3.46)

with repeated roots

r_1 = 1,  r_2 = 1.    (3.47)

The general solution is then

y = C_1 e^{x} + C_2 x e^{x}.    (3.48)

Example 3.6

Solve

\frac{d^2 y}{dx^2} - 2\frac{dy}{dx} + 10y = 0.    (3.49)

The characteristic equation is

r^2 - 2r + 10 = 0,    (3.50)

with solutions

r_1 = 1 + 3i,  r_2 = 1 - 3i.    (3.51)

The general solution is then

y = e^{x} (C_1 \cos 3x + C_2 \sin 3x).    (3.52)



3.2.2 Equations with variable coefficients 
3.2.2.1 One solution to find another 

If yi(x) is a known solution of 

y" + P(x)y' + Q(x)y = 0, (3.53) 

let the other solution be $y_2(x) = u(x) y_1(x)$. We then form derivatives of $y_2$ and substitute into the original differential equation. First compute the derivatives:

y_2' = u y_1' + u' y_1,    (3.54)
y_2'' = u y_1'' + u' y_1' + u' y_1' + u'' y_1,    (3.55)
y_2'' = u y_1'' + 2u' y_1' + u'' y_1.    (3.56)

Substituting into Eq. (3.53), we get

\underbrace{(u y_1'' + 2u' y_1' + u'' y_1)}_{y_2''} + P(x) \underbrace{(u y_1' + u' y_1)}_{y_2'} + Q(x) \underbrace{u y_1}_{y_2} = 0,    (3.57)
u'' y_1 + u' (2 y_1' + P(x) y_1) + u \underbrace{(y_1'' + P(x) y_1' + Q(x) y_1)}_{=0} = 0,    (3.58)
cancel coefficient on u:  u'' y_1 + u' (2 y_1' + P(x) y_1) = 0.    (3.59)

This can be written as a first-order equation in $v$, where $v = u'$:

v' y_1 + v (2 y_1' + P(x) y_1) = 0,    (3.60)

which is solved for $v(x)$ using known methods for first order equations.



3.2.2.2 Euler equation 

An equation of the type 



x^2 \frac{d^2 y}{dx^2} + A x \frac{dy}{dx} + B y = 0,    (3.61)

where $A$ and $B$ are constants, can be solved by a change of independent variables. Let

z = \ln x,    (3.62)

so that

x = e^{z}.    (3.63)

Then

\frac{dz}{dx} = \frac{1}{x} = e^{-z},    (3.64)
\frac{dy}{dx} = \frac{dy}{dz} \frac{dz}{dx} = e^{-z} \frac{dy}{dz},  so  \frac{d}{dx} = e^{-z} \frac{d}{dz},    (3.65)
\frac{d^2 y}{dx^2} = \frac{d}{dx} \left( \frac{dy}{dx} \right),    (3.66)
                   = e^{-z} \frac{d}{dz} \left( e^{-z} \frac{dy}{dz} \right),    (3.67)
                   = e^{-2z} \left( \frac{d^2 y}{dz^2} - \frac{dy}{dz} \right).    (3.68)

Substituting into Eq. (3.61), we get

\frac{d^2 y}{dz^2} + (A - 1) \frac{dy}{dz} + B y = 0,    (3.69)

which is an equation with constant coefficients.

In what amounts to the same approach, one can alternatively assume a solution of the form $y = C x^r$. This leads to a characteristic polynomial for $r$ of

r(r - 1) + A r + B = 0.    (3.70)

The two roots for $r$ induce two linearly independent complementary functions.



Example 3.7

Solve

x^2 y'' - 2x y' + 2y = 0,  for x > 0.    (3.71)

Here $A = -2$ and $B = 2$ in Eq. (3.61). Using this, along with $x = e^z$, we get Eq. (3.69) to reduce to

\frac{d^2 y}{dz^2} - 3\frac{dy}{dz} + 2y = 0.    (3.72)

The solution is

y = C_1 e^{z} + C_2 e^{2z} = C_1 x + C_2 x^2.    (3.73)

Note that this equation can also be solved by letting $y = C x^r$. Substituting into the equation, we get $r^2 - 3r + 2 = 0$, so that $r_1 = 1$ and $r_2 = 2$. The solution is then obtained as a linear combination of $x^{r_1}$ and $x^{r_2}$.



Example 3.8

Solve

x^2 \frac{d^2 y}{dx^2} + 3x \frac{dy}{dx} + 15y = 0.    (3.74)

Let us assume here that $y = C x^r$. Substituting this assumption into Eq. (3.74) yields

x^2 C r (r - 1) x^{r-2} + 3x C r x^{r-1} + 15 C x^{r} = 0.    (3.75)

For $x \ne 0$, $C \ne 0$, we divide by $C x^r$ to get

r(r - 1) + 3r + 15 = 0,    (3.76)
r^2 + 2r + 15 = 0.    (3.77)

Solving gives

r = -1 \pm i\sqrt{14}.    (3.78)

Thus, we see there are two linearly independent complementary functions:

y(x) = C_1 x^{-1 + i\sqrt{14}} + C_2 x^{-1 - i\sqrt{14}}.    (3.79)

Factoring gives

y(x) = \frac{1}{x} \left( C_1 x^{i\sqrt{14}} + C_2 x^{-i\sqrt{14}} \right).    (3.80)

Expanding in terms of exponentials and logarithms gives

y(x) = \frac{1}{x} \left( C_1 (\exp(\ln x))^{i\sqrt{14}} + C_2 (\exp(\ln x))^{-i\sqrt{14}} \right),    (3.81)
     = \frac{1}{x} \left( C_1 \exp(i\sqrt{14} \ln x) + C_2 \exp(-i\sqrt{14} \ln x) \right),    (3.82)
     = \frac{1}{x} \left( \hat{C}_1 \cos(\sqrt{14} \ln x) + \hat{C}_2 \sin(\sqrt{14} \ln x) \right),    (3.83)

where $\hat{C}_1 = C_1 + C_2$ and $\hat{C}_2 = i(C_1 - C_2)$.
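The characteristic polynomial of Eq. (3.70) is easy to check numerically. The following minimal sketch (the function names are illustrative assumptions) computes the roots for given $A$ and $B$ and evaluates the residual of the Euler equation for the trial solution $y = x^r$ at a sample point $x > 0$.

```python
import cmath

def euler_roots(A, B):
    # r(r - 1) + A r + B = 0  is  r^2 + (A - 1) r + B = 0.
    disc = cmath.sqrt((A - 1) ** 2 - 4 * B)
    return (-(A - 1) + disc) / 2, (-(A - 1) - disc) / 2

def euler_residual(A, B, r, x):
    # Residual of x^2 y'' + A x y' + B y for y = x^r, with x > 0.
    y = x ** r
    dy = r * x ** (r - 1)
    d2y = r * (r - 1) * x ** (r - 2)
    return x * x * d2y + A * x * dy + B * y

# Example 3.8: A = 3, B = 15, expecting r = -1 +/- i sqrt(14).
r1, r2 = euler_roots(3, 15)
print(r1, r2)
print(abs(euler_residual(3, 15, r1, 2.0)))

# Example 3.7: A = -2, B = 2, expecting r = 2 and r = 1.
s1, s2 = euler_roots(-2, 2)
print(s1, s2)
```

Complex roots are handled directly because Python supports raising a positive float to a complex power.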



3.3 Particular solutions 

We will now consider particular solutions of the inhomogeneous Eq. (3.1).

3.3.1 Method of undetermined coefficients 

Guess a solution with unknown coefficients, and then substitute in the equation to determine 
these coefficients. The number of undetermined coefficients has no relation to the order of 
the differential equation. 



Example 3.9

Consider

y'' + 4y' + 4y = 169 \sin 3x.    (3.84)

Thus

r^2 + 4r + 4 = 0,    (3.85)
(r + 2)(r + 2) = 0,    (3.86)
r_1 = -2,  r_2 = -2.    (3.87)

Since the roots are repeated, the complementary functions are

y_1 = e^{-2x},  y_2 = x e^{-2x}.    (3.88)

For the particular function, guess

y_p = a \sin 3x + b \cos 3x,    (3.89)

so

y_p' = 3a \cos 3x - 3b \sin 3x,    (3.90)
y_p'' = -9a \sin 3x - 9b \cos 3x.    (3.91)

Substituting into Eq. (3.84), we get

\underbrace{(-9a \sin 3x - 9b \cos 3x)}_{y_p''} + 4 \underbrace{(3a \cos 3x - 3b \sin 3x)}_{y_p'} + 4 \underbrace{(a \sin 3x + b \cos 3x)}_{y_p} = 169 \sin 3x,    (3.92)
(-5a - 12b) \sin 3x + (12a - 5b) \cos 3x = 169 \sin 3x,    (3.93)
(-5a - 12b - 169) \sin 3x + (12a - 5b) \cos 3x = 0.    (3.94)

Now sine and cosine can be shown to be linearly independent. Because of this, since the right hand side of Eq. (3.94) is zero, the constants on the sine and cosine functions must also be zero. This yields the simple system of linear algebraic equations

\begin{pmatrix} -5 & -12 \\ 12 & -5 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 169 \\ 0 \end{pmatrix},    (3.95)

from which we find that $a = -5$ and $b = -12$. The solution is then

y(x) = (C_1 + C_2 x) e^{-2x} - 5 \sin 3x - 12 \cos 3x.    (3.96)
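The small linear system of Eq. (3.95) can be solved by Cramer's rule, and the resulting particular solution checked against Eq. (3.84). This is a minimal sketch; the helper names are illustrative assumptions, not part of the notes.

```python
import math

def solve_2x2(m11, m12, m21, m22, b1, b2):
    # Cramer's rule for [[m11, m12], [m21, m22]] [x1, x2]^T = [b1, b2]^T.
    det = m11 * m22 - m12 * m21
    x1 = (b1 * m22 - m12 * b2) / det
    x2 = (m11 * b2 - m21 * b1) / det
    return x1, x2

a, b = solve_2x2(-5, -12, 12, -5, 169, 0)

def residual(x):
    # y_p'' + 4 y_p' + 4 y_p - 169 sin 3x for y_p = a sin 3x + b cos 3x.
    s, c = math.sin(3 * x), math.cos(3 * x)
    yp = a * s + b * c
    dyp = 3 * a * c - 3 * b * s
    d2yp = -9 * a * s - 9 * b * c
    return d2yp + 4 * dyp + 4 * yp - 169 * s

print(a, b)
print([residual(x) for x in (0.0, 0.4, 1.3)])
```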



Example 3.10

Solve

y'''' - 2y''' + y'' + 2y' - 2y = x^2 + x + 1.    (3.97)

Let the particular integral be of the form $y_p = ax^2 + bx + c$. Substituting and reducing, we get

-\underbrace{(2a + 1)}_{=0} x^2 + \underbrace{(4a - 2b - 1)}_{=0} x + \underbrace{(2a + 2b - 2c - 1)}_{=0} = 0.    (3.98)

Since $x^2$, $x^1$ and $x^0$ are linearly independent, their coefficients in Eq. (3.98) must be zero, from which $a = -1/2$, $b = -3/2$, and $c = -5/2$. Thus,

y_p = -\frac{1}{2} (x^2 + 3x + 5).    (3.99)

The solution of the homogeneous equation was found in a previous example, see Eq. (3.34), so that the general solution is

y = C_1 e^{-x} + C_2 e^{x} + e^{x} (C_3 \cos x + C_4 \sin x) - \frac{1}{2} (x^2 + 3x + 5).    (3.100)

A variant must be attempted if any term of $f(x)$ is a complementary function.



Example 3.11

Solve

y'' + 4y = 6 \sin 2x.    (3.101)

Since $\sin 2x$ is a complementary function, we will try

y_p = x (a \sin 2x + b \cos 2x),    (3.102)

from which

y_p' = 2x (a \cos 2x - b \sin 2x) + (a \sin 2x + b \cos 2x),    (3.103)
y_p'' = -4x (a \sin 2x + b \cos 2x) + 4 (a \cos 2x - b \sin 2x).    (3.104)

Substituting into Eq. (3.101), we compare coefficients and get $a = 0$, $b = -3/2$. The general solution is then

y = C_1 \sin 2x + C_2 \cos 2x - \frac{3}{2} x \cos 2x.    (3.105)



Example 3.12

Solve

y'' + 2y' + y = x e^{-x}.    (3.106)

The complementary functions are $e^{-x}$ and $x e^{-x}$. To get the particular solution we have to choose a function of the kind $y_p = a x^3 e^{-x}$. On substitution we find that $a = 1/6$. Thus, the general solution is

y = C_1 e^{-x} + C_2 x e^{-x} + \frac{1}{6} x^3 e^{-x}.    (3.107)
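The substitution step in Example 3.12 is short enough to write out. The following worked lines are a sketch of that step, filled in here rather than quoted from the notes:

```latex
y_p   = a x^3 e^{-x}, \\
y_p'  = a (3x^2 - x^3) e^{-x}, \\
y_p'' = a (6x - 6x^2 + x^3) e^{-x}, \\
y_p'' + 2 y_p' + y_p
      = a e^{-x} \left( 6x - 6x^2 + x^3 + 6x^2 - 2x^3 + x^3 \right)
      = 6 a x e^{-x}.
```

Matching $6 a x e^{-x}$ to the forcing $x e^{-x}$ gives $a = 1/6$, in agreement with Eq. (3.107).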



3.3.2 Variation of parameters 

For an equation of the class 

P_N(x) y^{(N)} + P_{N-1}(x) y^{(N-1)} + \ldots + P_1(x) y' + P_0(x) y = f(x),    (3.108)

we propose

y_p = \sum_{n=1}^{N} u_n(x) y_n(x),    (3.109)

where $y_n(x)$, $(n = 1, \ldots, N)$ are complementary functions of the equation, and $u_n(x)$, $(n = 1, \ldots, N)$ are $N$ unknown functions. Differentiating Eq. (3.109), we find

y_p' = \underbrace{\sum_{n=1}^{N} u_n' y_n}_{\text{choose to be } 0} + \sum_{n=1}^{N} u_n y_n'.    (3.110)

We set $\sum_{n=1}^{N} u_n' y_n$ to zero as a first condition. Differentiating the rest of Eq. (3.110), we obtain

y_p'' = \underbrace{\sum_{n=1}^{N} u_n' y_n'}_{\text{choose to be } 0} + \sum_{n=1}^{N} u_n y_n''.    (3.111)

Again we set the first term on the right side of Eq. (3.111) to zero as a second condition. Following this procedure repeatedly we arrive at

y_p^{(N-1)} = \underbrace{\sum_{n=1}^{N} u_n' y_n^{(N-2)}}_{\text{choose to be } 0} + \sum_{n=1}^{N} u_n y_n^{(N-1)}.    (3.112)

The vanishing of the first term on the right gives us the $(N - 1)$th condition. Substituting these into Eq. (3.108), the last condition

P_N(x) \sum_{n=1}^{N} u_n' y_n^{(N-1)} + \sum_{n=1}^{N} u_n \left( P_N y_n^{(N)} + P_{N-1} y_n^{(N-1)} + \ldots + P_1 y_n' + P_0 y_n \right) = f(x),    (3.113)

is obtained. Since each of the functions $y_n$ is a complementary function, the term within brackets is zero.

To summarize, we have the following $N$ equations in the $N$ unknowns $u_n'$, $(n = 1, \ldots, N)$ that we have obtained:

\sum_{n=1}^{N} u_n' y_n = 0,
\sum_{n=1}^{N} u_n' y_n' = 0,
\vdots    (3.114)
\sum_{n=1}^{N} u_n' y_n^{(N-2)} = 0,
P_N(x) \sum_{n=1}^{N} u_n' y_n^{(N-1)} = f(x).

These can be solved for $u_n'$, and then integrated to give the $u_n$'s.



Example 3.13

Solve

y'' + y = \tan x.    (3.115)

The complementary functions are

y_1 = \cos x,  y_2 = \sin x.    (3.116)

The equations for $u_1(x)$ and $u_2(x)$ are

u_1' y_1 + u_2' y_2 = 0,    (3.117)
u_1' y_1' + u_2' y_2' = \tan x.    (3.118)

Solving this system, which is linear in $u_1'$ and $u_2'$, we get

u_1' = -\sin x \tan x,    (3.119)
u_2' = \cos x \tan x.    (3.120)

Integrating, we get

u_1 = \int -\sin x \tan x \, dx = \sin x - \ln |\sec x + \tan x|,    (3.121)
u_2 = \int \cos x \tan x \, dx = -\cos x.    (3.122)

The particular solution is

y_p = u_1 y_1 + u_2 y_2,    (3.123)
    = (\sin x - \ln |\sec x + \tan x|) \cos x - \cos x \sin x,    (3.124)
    = -\cos x \ln |\sec x + \tan x|.    (3.125)

The complete solution, obtained by adding the complementary and particular, is

y = C_1 \cos x + C_2 \sin x - \cos x \ln |\sec x + \tan x|.    (3.126)
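The particular solution of Eq. (3.125) can be checked numerically without any symbolic algebra. The sketch below is an illustration with assumed names and step size; it approximates $y_p''$ with a central difference and evaluates the residual $y_p'' + y_p - \tan x$ at a few points in $(0, \pi/2)$.

```python
import math

def y_p(x):
    # Particular solution from Eq. (3.125).
    return -math.cos(x) * math.log(abs(1.0 / math.cos(x) + math.tan(x)))

def residual(x, h=1e-4):
    # Second-order central difference for y_p''; h is an assumed step size.
    d2 = (y_p(x + h) - 2.0 * y_p(x) + y_p(x - h)) / (h * h)
    return d2 + y_p(x) - math.tan(x)

points = [0.3, 0.7, 1.0]
residuals = [residual(x) for x in points]
print(residuals)
```

The residuals are not exactly zero because of truncation and round-off in the finite difference, but they should be small compared with $\tan x$ itself.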



3.3.3 Green's functions 

A similar goal can be achieved for boundary value problems involving a more general linear operator L, where L is given by Eq. (3.5). If on the closed interval $a \le x \le b$ we have a two point boundary problem for a general linear differential equation of the form:

L y = f(x),    (3.127)

where the highest derivative in L is order $N$ and with general homogeneous boundary conditions at $x = a$ and $x = b$ on linear combinations of $y$ and $N - 1$ of its derivatives:

A \left( y(a), y'(a), \ldots, y^{(N-1)}(a) \right)^T + B \left( y(b), y'(b), \ldots, y^{(N-1)}(b) \right)^T = 0,    (3.128)

where A and B are $N \times N$ constant coefficient matrices. Then, knowing L, A and B, we can form a solution of the form:

y(x) = \int_a^b f(s) g(x, s) \, ds.    (3.129)

This is desirable as 

• once g(x, s) is known, the solution is defined for all f including 

— forms of / for which no simple explicit integrals can be written, and 

— piecewise continuous forms of /, 

• numerical solution of the quadrature problem is more robust than direct numerical 
solution of the original differential equation, 

• the solution will automatically satisfy all boundary conditions, and 

• the solution is useful in experiments in which the system dynamics are well charac- 
terized (e.g. mass-spring-damper) but the forcing may be erratic (perhaps digitally 
specified). 

If the boundary conditions are inhomogeneous, a simple transformation of the dependent variables can be effected to render the boundary conditions homogeneous.

We now define the Green's² function $g(x, s)$ and proceed to show that with this definition, we are guaranteed to achieve the solution to the differential equation in the desired form as shown at the beginning of the section. We take $g(x, s)$ to be the Green's function for the linear differential operator L, as defined by Eq. (3.5), if it satisfies the following conditions:

• $L g(x, s) = \delta(x - s)$,

• $g(x, s)$ satisfies all boundary conditions given on $x$,

• $g(x, s)$ is a solution of $L g = 0$ on $a \le x < s$ and on $s < x \le b$,

• $g(x, s), g'(x, s), \ldots, g^{(N-2)}(x, s)$ are continuous for $x \in [a, b]$,

• $g^{(N-1)}(x, s)$ is continuous for $x \in [a, b]$ except at $x = s$ where it has a jump of $1/P_N(s)$; the jump is defined from left to right.

2 George Green, 1793-1841, English corn-miller and mathematician of humble origin and uncertain education, though he generated modern mathematics of the first rank.

Also for purposes of these conditions, $s$ is thought of as a constant parameter. In the actual Green's function representation of the solution, $s$ is a dummy variable. The Dirac delta function $\delta(x - s)$ is discussed in the Appendix, Sec. 10.7.10, and in Sec. 7.20 in Kaplan.

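These conditions are not all independent; nor is the dependence obvious. Consider for example,

L = P_2(x) \frac{d^2}{dx^2} + P_1(x) \frac{d}{dx} + P_0(x).    (3.130)

Then we have

P_2(x) \frac{d^2 g}{dx^2} + P_1(x) \frac{dg}{dx} + P_0(x) g = \delta(x - s),    (3.131)
\frac{d^2 g}{dx^2} + \frac{P_1(x)}{P_2(x)} \frac{dg}{dx} + \frac{P_0(x)}{P_2(x)} g = \frac{\delta(x - s)}{P_2(x)}.    (3.132)

Now integrate both sides with respect to $x$ in a small neighborhood enveloping $x = s$:

\int_{s-\epsilon}^{s+\epsilon} \frac{d^2 g}{dx^2} \, dx + \int_{s-\epsilon}^{s+\epsilon} \frac{P_1(x)}{P_2(x)} \frac{dg}{dx} \, dx + \int_{s-\epsilon}^{s+\epsilon} \frac{P_0(x)}{P_2(x)} g \, dx = \int_{s-\epsilon}^{s+\epsilon} \frac{\delta(x - s)}{P_2(x)} \, dx.    (3.133)

Since the $P$'s are continuous, as we let $\epsilon \to 0$ we get

\int_{s-\epsilon}^{s+\epsilon} \frac{d^2 g}{dx^2} \, dx + \frac{P_1(s)}{P_2(s)} \int_{s-\epsilon}^{s+\epsilon} \frac{dg}{dx} \, dx + \frac{P_0(s)}{P_2(s)} \int_{s-\epsilon}^{s+\epsilon} g \, dx = \frac{1}{P_2(s)} \int_{s-\epsilon}^{s+\epsilon} \delta(x - s) \, dx.    (3.134)

Integrating, we find

\left. \frac{dg}{dx} \right|_{s+\epsilon} - \left. \frac{dg}{dx} \right|_{s-\epsilon} + \frac{P_1(s)}{P_2(s)} \left( g|_{s+\epsilon} - g|_{s-\epsilon} \right) + \frac{P_0(s)}{P_2(s)} \int_{s-\epsilon}^{s+\epsilon} g \, dx = \frac{1}{P_2(s)} \left. H(x - s) \right|_{s-\epsilon}^{s+\epsilon}.    (3.135)

Since $g$ is continuous, this reduces to

\left. \frac{dg}{dx} \right|_{s+\epsilon} - \left. \frac{dg}{dx} \right|_{s-\epsilon} = \frac{1}{P_2(s)}.    (3.136)

This is consistent with the final point, that the second highest derivative of $g$ suffers a jump at $x = s$.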

Next, we show that applying this definition of $g(x, s)$ to our desired result lets us recover the original differential equation, rendering $g(x, s)$ to be appropriately defined. This can be easily shown by direct substitution:

y(x) = \int_a^b f(s) g(x, s) \, ds,    (3.137)
L y = L \int_a^b f(s) g(x, s) \, ds.    (3.138)

Now L behaves as $d^N/dx^N$, via Leibniz's rule, Eq. (1.293):

L y = \int_a^b f(s) \underbrace{L g(x, s)}_{\delta(x - s)} \, ds,    (3.139)
    = \int_a^b f(s) \delta(x - s) \, ds,    (3.140)
    = f(x).    (3.141)



Example 3.14

Find the Green's function and the corresponding solution integral of the differential equation

\frac{d^2 y}{dx^2} = f(x),    (3.142)

subject to boundary conditions

y(0) = 0,  y(1) = 0.    (3.143)

Verify the solution integral if $f(x) = 6x$.
Here

L = \frac{d^2}{dx^2}.    (3.144)

Now 1) break the problem up into two domains: a) $x < s$, b) $x > s$, 2) solve $L g = 0$ in both domains; four constants arise, 3) use boundary conditions for two constants, 4) use conditions at $x = s$: continuity of $g$ and a jump of $dg/dx$, for the other two constants.

a) $x < s$

\frac{d^2 g}{dx^2} = 0,    (3.145)
\frac{dg}{dx} = C_1,    (3.146)
g = C_1 x + C_2,    (3.147)
g(0) = 0 = C_1 (0) + C_2,    (3.148)
C_2 = 0,    (3.149)
g(x, s) = C_1 x,  x < s.    (3.150)

b) $x > s$

\frac{d^2 g}{dx^2} = 0,    (3.151)
\frac{dg}{dx} = C_3,    (3.152)
g = C_3 x + C_4,    (3.153)
g(1) = 0 = C_3 (1) + C_4,    (3.154)
C_4 = -C_3,    (3.155)
g(x, s) = C_3 (x - 1),  x > s.    (3.156)

Continuity of $g(x, s)$ when $x = s$:

C_1 s = C_3 (s - 1),    (3.157)
C_1 = C_3 \frac{s - 1}{s},    (3.158)
g(x, s) = C_3 \frac{s - 1}{s} x,  x < s,    (3.159)
g(x, s) = C_3 (x - 1),  x > s.    (3.160)

Jump in $dg/dx$ at $x = s$ (note $P_2(x) = 1$):

\left. \frac{dg}{dx} \right|_{s+\epsilon} - \left. \frac{dg}{dx} \right|_{s-\epsilon} = 1,    (3.161)
C_3 - C_3 \frac{s - 1}{s} = 1,    (3.162)
C_3 = s,    (3.163)
g(x, s) = x (s - 1),  x < s,    (3.164)
g(x, s) = s (x - 1),  x > s.    (3.165)

Note some properties of $g(x, s)$ which are common in such problems:

• it is broken into two domains,

• it is continuous in and through both domains,

• its $N - 1$ (here $N = 2$, so first) derivative is discontinuous at $x = s$,

• it is symmetric in $s$ and $x$ across the two domains, and

• it is seen by inspection to satisfy both boundary conditions.

The general solution in integral form can be written by breaking the integral into two pieces as

y(x) = \int_0^x f(s) \, s (x - 1) \, ds + \int_x^1 f(s) \, x (s - 1) \, ds,    (3.166)
     = (x - 1) \int_0^x f(s) \, s \, ds + x \int_x^1 f(s) (s - 1) \, ds.    (3.167)

Now evaluate the integral if $f(x) = 6x$ (thus $f(s) = 6s$).

y(x) = (x - 1) \int_0^x (6s) s \, ds + x \int_x^1 (6s)(s - 1) \, ds,    (3.168)
     = (x - 1) \int_0^x 6s^2 \, ds + x \int_x^1 (6s^2 - 6s) \, ds,    (3.169)
     = (x - 1) \left. (2s^3) \right|_0^x + x \left. (2s^3 - 3s^2) \right|_x^1,    (3.170)
     = (x - 1)(2x^3 - 0) + x \left( (2 - 3) - (2x^3 - 3x^2) \right),    (3.171)
     = 2x^4 - 2x^3 - x - 2x^4 + 3x^3,    (3.172)
y(x) = x^3 - x.    (3.173)

Note the original differential equation and both boundary conditions are automatically satisfied by the solution. The solution is plotted in Fig. 3.1.
Figure 3.1: Sketch of problem solution, y'' = 6x, y(0) = y(1) = 0. Panels: y(x) = x^3 - x in the domain of interest, 0 < x < 1, and y(x) = x^3 - x in the expanded domain, -2 < x < 2.



3.3.4 Operator D 

The linear operator D is defined by

D(y) = \frac{dy}{dx},    (3.174)

or, in terms of the operator alone,

D = \frac{d}{dx}.    (3.175)

The operator can be repeatedly applied, so that

D^n(y) = \frac{d^n y}{dx^n}.    (3.176)

Another example of its use is

(D - a)(D - b) f(x) = (D - a) \left( (D - b) f(x) \right),    (3.177)
                    = (D - a) \left( \frac{df}{dx} - b f \right),    (3.178)
                    = \frac{d^2 f}{dx^2} - (a + b) \frac{df}{dx} + a b f.    (3.179)

Negative powers of D are related to integrals. This comes from

\frac{dy(x)}{dx} = f(x),  y(x_0) = y_0,    (3.180)
y(x) = y_0 + \int_{x_0}^{x} f(s) \, ds,    (3.181)

then

substituting:   D(y(x)) = f(x),    (3.182)
apply inverse:  D^{-1}(D(y(x))) = D^{-1}(f(x)),    (3.183)
                y(x) = D^{-1}(f(x)),    (3.184)
                     = y_0 + \int_{x_0}^{x} f(s) \, ds,    (3.185)
so              D^{-1} = y_0 + \int_{x_0}^{x} (\ldots) \, ds.    (3.186)



We can evaluate $h(x)$ where

h(x) = \frac{1}{D - a} f(x),    (3.187)

in the following way

(D - a) h(x) = (D - a) \left( \frac{1}{D - a} f(x) \right),    (3.188)
(D - a) h(x) = f(x),    (3.189)
\frac{dh(x)}{dx} - a h(x) = f(x),    (3.190)
e^{-ax} \frac{dh(x)}{dx} - a e^{-ax} h(x) = f(x) e^{-ax},    (3.191)
\frac{d}{dx} \left( e^{-ax} h(x) \right) = f(x) e^{-ax},    (3.192)
\frac{d}{ds} \left( e^{-as} h(s) \right) = f(s) e^{-as},    (3.193)
\int_{x_0}^{x} \frac{d}{ds} \left( e^{-as} h(s) \right) ds = \int_{x_0}^{x} f(s) e^{-as} \, ds,    (3.194)
e^{-ax} h(x) - e^{-a x_0} h(x_0) = \int_{x_0}^{x} f(s) e^{-as} \, ds,    (3.195)
h(x) = e^{a(x - x_0)} h(x_0) + e^{ax} \int_{x_0}^{x} f(s) e^{-as} \, ds,    (3.196)
\frac{1}{D - a} f(x) = e^{a(x - x_0)} h(x_0) + e^{ax} \int_{x_0}^{x} f(s) e^{-as} \, ds.    (3.197)

This gives us $h(x)$ explicitly in terms of the known function $f$ such that $h$ satisfies $D(h) - a h = f$.

We can find the solution to higher order equations such as

(D - a)(D - b) y(x) = f(x),  y(x_0) = y_0,  y'(x_0) = y_0',    (3.198)
(D - b) y(x) = \frac{1}{D - a} f(x),    (3.199)
(D - b) y(x) = h(x),    (3.200)
y(x) = y_0 e^{b(x - x_0)} + e^{bx} \int_{x_0}^{x} h(s) e^{-bs} \, ds.    (3.201)

Note that

\frac{dy}{dx} = y_0 b e^{b(x - x_0)} + h(x) + b e^{bx} \int_{x_0}^{x} h(s) e^{-bs} \, ds,    (3.202)
\frac{dy}{dx}(x_0) = y_0' = y_0 b + h(x_0),    (3.203)

which can be rewritten as

(D - b)(y(x_0)) = h(x_0),    (3.204)

which is what one would expect.

Returning to the problem at hand, we take our expression for $h(x)$, evaluate it at $x = s$ and substitute into the expression for $y(x)$ to get

y(x) = y_0 e^{b(x - x_0)} + e^{bx} \int_{x_0}^{x} \left( h(x_0) e^{a(s - x_0)} + e^{as} \int_{x_0}^{s} f(t) e^{-at} \, dt \right) e^{-bs} \, ds,    (3.205)

     = y_0 e^{b(x - x_0)} + e^{bx} \int_{x_0}^{x} \left( (y_0' - y_0 b) e^{a(s - x_0)} + e^{as} \int_{x_0}^{s} f(t) e^{-at} \, dt \right) e^{-bs} \, ds,    (3.206)

     = y_0 e^{b(x - x_0)} + e^{bx} \int_{x_0}^{x} \left( (y_0' - y_0 b) e^{(a - b)s - a x_0} + e^{(a - b)s} \int_{x_0}^{s} f(t) e^{-at} \, dt \right) ds,    (3.207)

     = y_0 e^{b(x - x_0)} + e^{bx} (y_0' - y_0 b) \int_{x_0}^{x} e^{(a - b)s - a x_0} \, ds + e^{bx} \int_{x_0}^{x} e^{(a - b)s} \left( \int_{x_0}^{s} f(t) e^{-at} \, dt \right) ds,    (3.208)

     = y_0 e^{b(x - x_0)} + e^{bx} (y_0' - y_0 b) \frac{e^{a(x - x_0) - xb} - e^{-b x_0}}{a - b} + e^{bx} \int_{x_0}^{x} e^{(a - b)s} \left( \int_{x_0}^{s} f(t) e^{-at} \, dt \right) ds,    (3.209)

     = y_0 e^{b(x - x_0)} + (y_0' - y_0 b) \frac{e^{a(x - x_0)} - e^{b(x - x_0)}}{a - b} + e^{bx} \int_{x_0}^{x} e^{(a - b)s} \left( \int_{x_0}^{s} f(t) e^{-at} \, dt \right) ds,    (3.210)

     = y_0 e^{b(x - x_0)} + (y_0' - y_0 b) \frac{e^{a(x - x_0)} - e^{b(x - x_0)}}{a - b} + e^{bx} \int_{x_0}^{x} \int_{x_0}^{s} e^{(a - b)s} f(t) e^{-at} \, dt \, ds.    (3.211)

Changing the order of integration and integrating on $s$, we get

y(x) = y_0 e^{b(x - x_0)} + (y_0' - y_0 b) \frac{e^{a(x - x_0)} - e^{b(x - x_0)}}{a - b} + e^{bx} \int_{x_0}^{x} \int_{t}^{x} e^{(a - b)s} f(t) e^{-at} \, ds \, dt,    (3.212)

     = y_0 e^{b(x - x_0)} + (y_0' - y_0 b) \frac{e^{a(x - x_0)} - e^{b(x - x_0)}}{a - b} + e^{bx} \int_{x_0}^{x} f(t) e^{-at} \left( \int_{t}^{x} e^{(a - b)s} \, ds \right) dt,    (3.213)

     = y_0 e^{b(x - x_0)} + (y_0' - y_0 b) \frac{e^{a(x - x_0)} - e^{b(x - x_0)}}{a - b} + \int_{x_0}^{x} \frac{f(t)}{a - b} \left( e^{a(x - t)} - e^{b(x - t)} \right) dt.    (3.214)

Thus, we have a solution to the second order linear differential equation with constant coefficients and arbitrary forcing expressed in integral form. A similar alternate expression can be developed when $a = b$.
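Equation (3.214) lends itself to direct numerical evaluation. The following minimal sketch (the function name, the midpoint quadrature, and the test case are illustrative assumptions, and $a \ne b$ is required) evaluates the formula and compares it with a case that is easy to solve by hand: $(D - 1)(D - 2) y = 1$ with $y(0) = 0$, $y'(0) = 0$, whose solution is $y = 1/2 - e^{x} + e^{2x}/2$.

```python
import math

def y_operator_formula(x, a, b, f, y0=0.0, yp0=0.0, x0=0.0, n=4000):
    # Eq. (3.214) for (D - a)(D - b) y = f, y(x0) = y0, y'(x0) = yp0, a != b.
    ea = math.exp(a * (x - x0))
    eb = math.exp(b * (x - x0))
    homogeneous = y0 * eb + (yp0 - y0 * b) * (ea - eb) / (a - b)
    # Midpoint rule for the forcing integral over t in [x0, x].
    h = (x - x0) / n
    integral = 0.0
    for i in range(n):
        t = x0 + (i + 0.5) * h
        kernel = (math.exp(a * (x - t)) - math.exp(b * (x - t))) / (a - b)
        integral += f(t) * kernel
    return homogeneous + integral * h

def unit_forcing(t):
    return 1.0

def zero_forcing(t):
    return 0.0

x = 1.0
forced = y_operator_formula(x, 1.0, 2.0, unit_forcing)
exact_forced = 0.5 - math.exp(x) + 0.5 * math.exp(2.0 * x)

# Unforced check: y(0) = 1, y'(0) = 0 gives y = 2 e^x - e^(2x).
unforced = y_operator_formula(x, 1.0, 2.0, zero_forcing, y0=1.0, yp0=0.0)
exact_unforced = 2.0 * math.exp(x) - math.exp(2.0 * x)

print(forced, exact_forced)
print(unforced, exact_unforced)
```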

Problems 

1. Find the general solution of the differential equation 

y' + x^2 y (1 + y) = 1 + x^3 (1 + x).

2. Show that the functions $y_1 = \sin x$, $y_2 = x \cos x$, and $y_3 = x$ are linearly independent. Find the lowest order differential equation of which they are the complementary functions.

3. Solve the following initial value problem for (a) C = 6, (b) C = 4, and (c) C = 3 with j/(0) = 1 and 
y'(0) = -3. 

d 2 y „dy 

Plot your results. 

4. Solve 

(a) &-3^ + 4j/ = 0, 

(b) &-5& + ll&-7U = 12, 

(c) y" + 2y = 6e x + cos 2x, 

(d) x 2 y" — 3xy' — 5y = x 2 log.x, 

(e) -j-jf + y = 2e x cos a; + (e x — 2) sin a;. 

5. Find a particular solution to the following ODE using (a) variation of parameters and (b) undetermined 
coefficients. 

dx 2 

6. Solve the boundary value problem 

d 2 y dy_ _ 
dx 2 dx 
with boundary conditions $y(0) = 0$ and $y(\pi/2) = -1$. Plot your result.

7. Solve 

, d 3 y d 2 y dy 

2x 2 — 4 + 2x— 4 - 8-^- = 1, 



4w = cosh 2x. 

r-2 y 



dx 3 dx 2 dx 

with 2/(1) = 4, j/(l) = 8, j/(2) = 11. Plot your result 
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Solve 



9. Find the general solution of 



10. Find the Green's function solution of 



x 2 y" + xy' — Ay = &x. 



y" + 2y' +y = xe~ x . 



y" + y' - 2y = fix), 

with 2/(0) = 0, y'(l) = 0. Determine y(x) if f{x) = 3sinx. Plot your result. 

11. Find the Green's function solution of 

y" + 4y = fix), 

with y(0) = y(l), 2/'(0) = 0. Verify this is the correct solution when fix) = x 2 . Plot your result. 

12. Solve 2/"' - 2y" - y' + 2y = sin 2 x. 

13. Solve y"' + 6y" + 12y' + 8y = e x - 3 sin x - 8e~ 2x . 

14. Solve x 4 y"" + 7x 3 y'" + 8x 2 y" = 4x" 3 . 

15. Show that x~ x and x 5 are solutions of the equation 



Thus, find the general solution of 
16. Solve the equation 

where x > 0. 



x y" — 3xy' — by = 0. 



x y" — 3xy' — by = x . 



22/" - 42/' + 22/ 
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Chapter 4 

Series solution methods 



see Kaplan, Chapter 6, 

see Hinch, Chapters 1, 2, 5, 6, 7, 

see Bender and Orszag, 

see Kevorkian and Cole,

see Van Dyke, 

see Murdock, 

see Holmes, 

see Lopez, Chapters 7-11, 14, 

see Riley, Hobson, and Bence, Chapter 14.

This chapter will deal with series solution methods. Such methods are useful in solving both 
algebraic and differential equations. The first method is formally exact in that an infinite 
number of terms can often be shown to have absolute and uniform convergence properties. 
The second method, asymptotic series solutions, is less rigorous in that convergence is not 
always guaranteed; in fact convergence is rarely examined because the problems tend to 
be intractable. Still asymptotic methods will be seen to be quite useful in interpreting the 
results of highly non-linear equations in local domains. 

4.1 Power series 

Solutions to many differential equations cannot be found in a closed form solution expressed 
for instance in terms of polynomials and transcendental functions such as sin and cos. Often, 
instead, the solutions can be expressed as an infinite series of polynomials. It is desirable 
to get a complete expression for the n th term of the series so that one can make statements 
regarding absolute and uniform convergence of the series. Such solutions are approximate 
in that if one uses a finite number of terms to represent the solution, there is a truncation 
error. Formally though, for series which converge, an infinite number of terms gives a true 
representation of the actual solution, and hence the method is exact. 

A function $f(x)$ is said to be analytic if it is an infinitely differentiable function such that the Taylor series, $\sum_{n=0}^{\infty} f^{(n)}(x_0)(x - x_0)^n / n!$, at any point $x = x_0$ in its domain converges to $f(x)$ in a neighborhood of $x = x_0$.



4.1.1 First-order equation 

An equation of the form 



\frac{dy}{dx} + P(x) y = Q(x),    (4.1)

where $P(x)$ and $Q(x)$ are analytic at $x = a$, has a power series solution

y(x) = \sum_{n=0}^{\infty} a_n (x - a)^n,    (4.2)



around this point. 



Example 4.1

Find the power series solution of

\frac{dy}{dx} = y,  y(0) = y_0,    (4.3)

around $x = 0$.
Let

y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots,    (4.4)

so that

\frac{dy}{dx} = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots.    (4.5)

Substituting into Eq. (4.3), we have

\underbrace{a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots}_{dy/dx} = \underbrace{a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots}_{y},    (4.6)
\underbrace{(a_1 - a_0)}_{=0} + \underbrace{(2a_2 - a_1)}_{=0} x + \underbrace{(3a_3 - a_2)}_{=0} x^2 + \underbrace{(4a_4 - a_3)}_{=0} x^3 + \cdots = 0.    (4.7)

Because the polynomials $x^0, x^1, x^2, \ldots$ are linearly independent, the coefficients must be all zero. Thus,

a_1 = a_0,    (4.8)
a_2 = \frac{1}{2} a_1 = \frac{1}{2} a_0,    (4.9)
a_3 = \frac{1}{3} a_2 = \frac{1}{3!} a_0,    (4.10)
a_4 = \frac{1}{4} a_3 = \frac{1}{4!} a_0,    (4.11)

so that

y(x) = a_0 \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \right).    (4.12)

Figure 4.1: Comparison of truncated series and exact solutions.



Applying the initial condition at $x = 0$ gives $a_0 = y_0$, so

y(x) = y_0 \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \right).    (4.13)

Of course this power series is the Taylor series expansion, see Sec. 10.1, of the closed form solution $y = y_0 e^x$ about $x = 0$. The power series solution about a different point will give a different series.

For $y_0 = 1$ the exact solution and three approximations to the exact solution are shown in Figure 4.1.
Alternatively, one can use a compact summation notation. Thus, 



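$$y = \sum_{n=0}^{\infty} a_n x^n, \tag{4.14}$$

$$\frac{dy}{dx} = \sum_{n=0}^{\infty} n a_n x^{n-1}, \tag{4.15}$$

$$\frac{dy}{dx} = \sum_{n=1}^{\infty} n a_n x^{n-1}, \tag{4.16}$$

$$m = n - 1: \qquad \frac{dy}{dx} = \sum_{m=0}^{\infty} (m+1) a_{m+1} x^{m}, \tag{4.17}$$

$$\frac{dy}{dx} = \sum_{n=0}^{\infty} (n+1) a_{n+1} x^{n}. \tag{4.18}$$

Thus, the differential equation becomes

$$\sum_{n=0}^{\infty} (n+1) a_{n+1} x^{n} = \sum_{n=0}^{\infty} a_n x^n, \tag{4.19}$$

$$\sum_{n=0}^{\infty} \underbrace{\left((n+1) a_{n+1} - a_n\right)}_{=0} x^n = 0, \tag{4.20}$$

$$(n+1)\,a_{n+1} = a_n, \tag{4.21}$$

$$a_{n+1} = \frac{1}{n+1}\,a_n, \tag{4.22}$$

$$a_n = \frac{a_0}{n!}, \tag{4.23}$$

$$y = a_0 \sum_{n=0}^{\infty} \frac{x^n}{n!}, \tag{4.24}$$

$$y = y_0 \sum_{n=0}^{\infty} \frac{x^n}{n!}. \tag{4.25}$$

The ratio test tells us that

$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty}\frac{1}{n+1} = 0, \tag{4.26}$$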

so the series converges absolutely. 
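The recursion of Eq. (4.22) is easy to check numerically. The following is a minimal sketch, not part of the original notes; the function names `exp_series_coeffs` and `partial_sum` are illustrative choices. It builds the coefficients from $a_{n+1} = a_n/(n+1)$ and compares truncated sums at $x = 1$ with $e$.

```python
import math

def exp_series_coeffs(n_terms, a0=1.0):
    """Return a_0 .. a_{n_terms-1} generated by a_{n+1} = a_n / (n + 1)."""
    coeffs = [a0]
    for n in range(n_terms - 1):
        coeffs.append(coeffs[-1] / (n + 1))
    return coeffs

def partial_sum(coeffs, x):
    """Evaluate sum_n a_n x^n with Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total

if __name__ == "__main__":
    # Two, three, and four terms, then a well-converged sum.
    for n_terms in (2, 3, 4, 20):
        approx = partial_sum(exp_series_coeffs(n_terms), 1.0)
        print(n_terms, approx, abs(approx - math.e))
```

The error at fixed $x$ shrinks roughly like the first omitted term $x^n/n!$, which is the behavior the ratio test anticipates.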

If a series is uniformly convergent in a domain, it converges at the same rate for all x in that domain. We can use the Weierstrass¹ M-test for uniform convergence. That is, for a series

$$\sum_{n=0}^{\infty} u_n(x), \tag{4.27}$$

to be uniformly convergent, we need a convergent series of constants $M_n$ to exist,

$$\sum_{n=0}^{\infty} M_n, \tag{4.28}$$

such that

$$|u_n(x)| \le M_n, \tag{4.29}$$

for all x in the domain. For our problem, we take $x \in [-A, A]$, where $A > 0$.
So for uniform convergence we must have

$$\left|\frac{x^n}{n!}\right| \le M_n. \tag{4.30}$$

So take

$$M_n = \frac{A^n}{n!}. \tag{4.31}$$

(Note $M_n$ is thus strictly positive.) So

$$\sum_{n=0}^{\infty} M_n = \sum_{n=0}^{\infty} \frac{A^n}{n!}. \tag{4.32}$$

By the ratio test, this is convergent if

$$\lim_{n\to\infty}\left|\frac{A^{n+1}/(n+1)!}{A^n/n!}\right| < 1, \tag{4.33}$$

$$\lim_{n\to\infty}\left|\frac{A}{n+1}\right| < 1. \tag{4.34}$$

This holds for all A, so the series converges absolutely for all $x \in (-\infty, \infty)$, and uniformly on every bounded interval $[-A, A]$.
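As an illustration of what the M-test bound delivers, the sketch below (an addition under stated assumptions, with hypothetical helper names `partial_exp`, `m_tail`, and `worst_error`) compares the worst truncation error of the exponential series on a grid over $[-A, A]$ with the tail $\sum_{n \ge N} A^n/n!$ of the bounding series. The tail is itself truncated after `extra` further terms, which is an approximation.

```python
import math

def partial_exp(x, n_terms):
    """Sum of the first n_terms terms of sum_n x^n / n!."""
    term, total = 1.0, 0.0
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)
    return total

def m_tail(A, n_terms, extra=100):
    """Approximate sum_{n >= n_terms} A^n / n! using `extra` tail terms."""
    term, total = 1.0, 0.0
    for n in range(n_terms + extra):
        if n >= n_terms:
            total += term
        term *= A / (n + 1)
    return total

def worst_error(A, n_terms, samples=601):
    """Largest truncation error on a uniform grid over [-A, A]."""
    xs = [-A + 2.0 * A * i / (samples - 1) for i in range(samples)]
    return max(abs(math.exp(x) - partial_exp(x, n_terms)) for x in xs)

if __name__ == "__main__":
    A, N = 3.0, 15
    print("worst error on grid:", worst_error(A, N))
    print("M-test tail bound  :", m_tail(A, N))
```

The single number `m_tail(A, N)` bounds the error at every grid point at once, which is the sense in which the convergence is uniform on the interval.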



1 Karl Theodor Wilhelm Weierstrass, 1815-1897, Westphalia-born German mathematician. 
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4.1.2 Second-order equation 

We consider series solutions of 

$$P(x)\frac{d^2y}{dx^2} + Q(x)\frac{dy}{dx} + R(x)\,y = 0, \tag{4.35}$$

around $x = a$. There are three different cases, depending on the behavior of $P(a)$, $Q(a)$ and $R(a)$, in which $x = a$ is classified as an ordinary point, a regular singular point, or an irregular singular point. These are described next.

4.1.2.1 Ordinary point

If $P(a) \ne 0$ and $Q/P$, $R/P$ are analytic at $x = a$, this point is called an ordinary point. The general solution is $y = C_1 y_1(x) + C_2 y_2(x)$ where $y_1$ and $y_2$ are of the form $\sum_{n=0}^{\infty} a_n (x-a)^n$. The radius of convergence of the series is at least the distance to the nearest complex singularity, i.e. the distance between $x = a$ and the closest point on the complex plane at which $Q/P$ or $R/P$ is not analytic.



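Example 4.2

Find the series solution of

$$y'' + xy' + y = 0, \qquad y(0) = y_0, \quad y'(0) = y'_0, \tag{4.36}$$

around $x = 0$.

The point $x = 0$ is an ordinary point, so that we have

$$y = \sum_{n=0}^{\infty} a_n x^n, \tag{4.37}$$

$$y' = \sum_{n=1}^{\infty} n a_n x^{n-1}, \tag{4.38}$$

$$xy' = \sum_{n=1}^{\infty} n a_n x^{n}, \tag{4.39}$$

$$xy' = \sum_{n=0}^{\infty} n a_n x^{n}, \tag{4.40}$$

$$y'' = \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}, \tag{4.41}$$

$$m = n - 2: \qquad y'' = \sum_{m=0}^{\infty} (m+1)(m+2) a_{m+2} x^{m}, \tag{4.42}$$

$$y'' = \sum_{n=0}^{\infty} (n+1)(n+2) a_{n+2} x^{n}. \tag{4.43}$$

Substituting into Eq. (4.36), we get

$$\sum_{n=0}^{\infty} \underbrace{\left((n+1)(n+2) a_{n+2} + n a_n + a_n\right)}_{=0} x^n = 0. \tag{4.44}$$

Equating the coefficients to zero, we get

$$a_{n+2} = -\frac{1}{n+2}\,a_n, \tag{4.45}$$

so that

$$y = a_0\left(1 - \frac{x^2}{2} + \frac{x^4}{4\cdot 2} - \frac{x^6}{6\cdot 4\cdot 2} + \cdots\right) + a_1\left(x - \frac{x^3}{3} + \frac{x^5}{5\cdot 3} - \frac{x^7}{7\cdot 5\cdot 3} + \cdots\right), \tag{4.46}$$

$$y = y_0\left(1 - \frac{x^2}{2} + \frac{x^4}{4\cdot 2} - \frac{x^6}{6\cdot 4\cdot 2} + \cdots\right) + y'_0\left(x - \frac{x^3}{3} + \frac{x^5}{5\cdot 3} - \frac{x^7}{7\cdot 5\cdot 3} + \cdots\right), \tag{4.47}$$

$$y = y_0 \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n n!}\,x^{2n} + y'_0 \sum_{n=1}^{\infty} \frac{(-1)^{n-1}\,2^n\,n!}{(2n)!}\,x^{2n-1}, \tag{4.48}$$

$$y = y_0 \sum_{n=0}^{\infty} \frac{1}{n!}\left(-\frac{x^2}{2}\right)^{n} - \frac{y'_0}{x} \sum_{n=1}^{\infty} \frac{n!}{(2n)!}\left(-2x^2\right)^{n}. \tag{4.49}$$

The series converges for all x. For $y_0 = 1$, $y'_0 = 0$ the exact solution, which can be shown to be

$$y = \exp\left(-\frac{x^2}{2}\right), \tag{4.50}$$

and two approximations to the exact solution are shown in Fig. 4.2. For arbitrary $y_0$ and $y'_0$, the solution can be shown to be

$$y = \exp\left(-\frac{x^2}{2}\right)\left(y_0 + \sqrt{\frac{\pi}{2}}\;y'_0\,\mathrm{erfi}\left(\frac{x}{\sqrt{2}}\right)\right). \tag{4.51}$$

Here "erfi" is the so-called imaginary error function; see Sec. 10.7.4 of the Appendix.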
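The two-term recursion of Eq. (4.45) can be exercised directly. The sketch below is an illustrative addition (the names `series_coeffs` and `evaluate` are not from the notes); it generates the coefficients from $y_0$ and $y'_0$ and, for $y_0 = 1$, $y'_0 = 0$, compares the truncated series against $\exp(-x^2/2)$.

```python
import math

def series_coeffs(y0, yp0, n_terms):
    """Coefficients a_0 .. a_{n_terms-1} from a_{n+2} = -a_n / (n + 2)."""
    a = [y0, yp0]
    for n in range(n_terms - 2):
        a.append(-a[n] / (n + 2))
    return a[:n_terms]

def evaluate(a, x):
    """Evaluate the truncated power series sum_n a_n x^n."""
    return sum(c * x**k for k, c in enumerate(a))

if __name__ == "__main__":
    x = 1.5
    for n_terms in (3, 5, 40):
        approx = evaluate(series_coeffs(1.0, 0.0, n_terms), x)
        print(n_terms, approx, abs(approx - math.exp(-x * x / 2.0)))
```

With three or five coefficients this reproduces the truncations $1 - x^2/2$ and $1 - x^2/2 + x^4/8$ of the example.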



4.1.2.2 Regular singular point 

If $P(a) = 0$, then $x = a$ is a singular point. Furthermore, if $(x - a)Q/P$ and $(x - a)^2 R/P$ are both analytic at $x = a$, this point is called a regular singular point. Then there exists at least one solution of the form

$$y(x) = (x-a)^r \sum_{n=0}^{\infty} a_n (x-a)^n = \sum_{n=0}^{\infty} a_n (x-a)^{n+r}. \tag{4.52}$$

This is known as the Frobenius² method. The radius of convergence of the series is again the distance to the nearest complex singularity.

An equation for r is called the indicial equation. The following are the different kinds of solutions of the indicial equation possible:



2 Ferdinand Georg Frobenius, 1849-1917, Prussian/German mathematician.

Figure 4.2: Comparison of truncated series and exact solutions for $y'' + xy' + y = 0$, $y(0) = 1$, $y'(0) = 0$. Curves shown: $y = 1 - x^2/2$, $y = 1 - x^2/2 + x^4/8$, and $y = \exp(-x^2/2)$ (exact).

• $r_1 \ne r_2$, and $r_1 - r_2$ not an integer. Then

$$y_1 = (x-a)^{r_1} \sum_{n=0}^{\infty} a_n (x-a)^n = \sum_{n=0}^{\infty} a_n (x-a)^{n+r_1}, \tag{4.53}$$

$$y_2 = (x-a)^{r_2} \sum_{n=0}^{\infty} b_n (x-a)^n = \sum_{n=0}^{\infty} b_n (x-a)^{n+r_2}. \tag{4.54}$$

• $r_1 = r_2 = r$. Then

$$y_1 = (x-a)^{r} \sum_{n=0}^{\infty} a_n (x-a)^n = \sum_{n=0}^{\infty} a_n (x-a)^{n+r}, \tag{4.55}$$

$$y_2 = y_1 \ln x + (x-a)^{r} \sum_{n=0}^{\infty} b_n (x-a)^n = y_1 \ln x + \sum_{n=0}^{\infty} b_n (x-a)^{n+r}. \tag{4.56}$$

• $r_1 \ne r_2$, and $r_1 - r_2$ is a positive integer. Then

$$y_1 = (x-a)^{r_1} \sum_{n=0}^{\infty} a_n (x-a)^n = \sum_{n=0}^{\infty} a_n (x-a)^{n+r_1}, \tag{4.57}$$

$$y_2 = k\,y_1 \ln x + (x-a)^{r_2} \sum_{n=0}^{\infty} b_n (x-a)^n = k\,y_1 \ln x + \sum_{n=0}^{\infty} b_n (x-a)^{n+r_2}. \tag{4.58}$$

The constants $a_n$ and $k$ are determined by the differential equation. The general solution is

$$y(x) = C_1 y_1(x) + C_2 y_2(x). \tag{4.59}$$

Example 4.3

Find the series solution of

$$4xy'' + 2y' + y = 0, \tag{4.60}$$

around $x = 0$.

The point $x = 0$ is a regular singular point. So we have $a = 0$ and take

$$y = x^r \sum_{n=0}^{\infty} a_n x^n, \tag{4.61}$$

$$y = \sum_{n=0}^{\infty} a_n x^{n+r}, \tag{4.62}$$

$$y' = \sum_{n=0}^{\infty} a_n (n+r)\,x^{n+r-1}, \tag{4.63}$$

$$y'' = \sum_{n=0}^{\infty} a_n (n+r)(n+r-1)\,x^{n+r-2}, \tag{4.64}$$

$$\underbrace{4\sum_{n=0}^{\infty} a_n (n+r)(n+r-1)\,x^{n+r-1}}_{=4xy''} + \underbrace{2\sum_{n=0}^{\infty} a_n (n+r)\,x^{n+r-1}}_{=2y'} + \underbrace{\sum_{n=0}^{\infty} a_n x^{n+r}}_{=y} = 0, \tag{4.65}$$

$$2\sum_{n=0}^{\infty} a_n (n+r)(2n+2r-1)\,x^{n+r-1} + \sum_{n=0}^{\infty} a_n x^{n+r} = 0, \tag{4.66}$$

$$m = n - 1: \qquad 2\sum_{m=-1}^{\infty} a_{m+1} (m+1+r)(2(m+1)+2r-1)\,x^{m+r} + \sum_{n=0}^{\infty} a_n x^{n+r} = 0, \tag{4.67}$$

$$2\sum_{n=-1}^{\infty} a_{n+1} (n+1+r)(2(n+1)+2r-1)\,x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r} = 0, \tag{4.68}$$

$$2a_0 r(2r-1)\,x^{-1+r} + 2\sum_{n=0}^{\infty} a_{n+1} (n+1+r)(2(n+1)+2r-1)\,x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r} = 0. \tag{4.69}$$

The first term ($n = -1$) gives the indicial equation:

$$r(2r-1) = 0, \tag{4.70}$$

from which $r = 0, 1/2$. We then have

$$2\sum_{n=0}^{\infty} a_{n+1} (n+r+1)(2n+2r+1)\,x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r} = 0, \tag{4.71}$$

$$\sum_{n=0}^{\infty} \underbrace{\left(2a_{n+1} (n+r+1)(2n+2r+1) + a_n\right)}_{=0} x^{n+r} = 0. \tag{4.72}$$

For $r = 0$

$$a_{n+1} = -a_n \frac{1}{(2n+2)(2n+1)}, \tag{4.73}$$

$$y_1 = a_0\left(1 - \frac{x}{2!} + \frac{x^2}{4!} - \frac{x^3}{6!} + \cdots\right). \tag{4.74}$$

Figure 4.3: Comparison of truncated series and exact solutions for $4xy'' + 2y' + y = 0$, $y(0) = 1$, $y'(0) < \infty$. Curves shown: $y = 1 - x/2$ and $y = \cos(x^{1/2})$ (exact).

For $r = 1/2$

$$a_{n+1} = -a_n \frac{1}{2(2n+3)(n+1)}, \tag{4.75}$$

$$y_2 = a_0 x^{1/2}\left(1 - \frac{x}{3!} + \frac{x^2}{5!} - \frac{x^3}{7!} + \cdots\right). \tag{4.76}$$

The series converges for all x to $y_1 = \cos\sqrt{x}$ and $y_2 = \sin\sqrt{x}$. The general solution is

$$y = C_1 y_1 + C_2 y_2, \tag{4.77}$$

or

$$y(x) = C_1 \cos\sqrt{x} + C_2 \sin\sqrt{x}. \tag{4.78}$$

Note that $y(x)$ is real and non-singular for $x \in [0, \infty)$. However, the first derivative

$$y'(x) = -C_1 \frac{\sin\sqrt{x}}{2\sqrt{x}} + C_2 \frac{\cos\sqrt{x}}{2\sqrt{x}}, \tag{4.79}$$

is singular at $x = 0$. The nature of the singularity is seen from a Taylor series expansion of $y'(x)$ about $x = 0$, which gives

$$y'(x) \sim C_1\left(-\frac{1}{2} + \frac{x}{12} - \cdots\right) + C_2\left(\frac{1}{2\sqrt{x}} - \frac{\sqrt{x}}{4} + \cdots\right). \tag{4.80}$$

So there is a weak $1/\sqrt{x}$ singularity in $y'(x)$ at $x = 0$.
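The Frobenius recursion of Eq. (4.72) can be summed numerically for either indicial root. The following sketch is an illustrative addition (the function name `frobenius_sum` and the choice $a_0 = 1$ are assumptions); it checks the two series against $\cos\sqrt{x}$ and $\sin\sqrt{x}$.

```python
import math

def frobenius_sum(r, x, n_terms=30):
    """Sum x^r * sum_n a_n x^n with a_0 = 1 and
    2 a_{n+1} (n + r + 1)(2n + 2r + 1) + a_n = 0."""
    a, total = 1.0, 0.0
    for n in range(n_terms):
        total += a * x**n
        a = -a / (2.0 * (n + r + 1.0) * (2.0 * n + 2.0 * r + 1.0))
    return x**r * total

if __name__ == "__main__":
    x = 2.0
    print(frobenius_sum(0.0, x), math.cos(math.sqrt(x)))  # root r = 0
    print(frobenius_sum(0.5, x), math.sin(math.sqrt(x)))  # root r = 1/2
```

Passing the root `r` as a parameter keeps one routine for both solutions; only the prefactor $x^r$ and the recursion denominators change.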

For $y(0) = 1$, $y'(0) < \infty$, the exact solution and the linear approximation to the exact solution are shown in Fig. 4.3. For this case, one has $C_1 = 1$ to satisfy the condition on $y(0)$, and one must have $C_2 = 0$ to satisfy the non-singular condition on $y'(0)$.

Example 4.4

Find the series solution of

$$xy'' - y = 0, \tag{4.81}$$

around $x = 0$.

Let $y = \sum_{n=0}^{\infty} a_n x^{n+r}$. Then, from Eq. (4.81),

$$r(r-1)a_0 x^{r-1} + \sum_{n=1}^{\infty} \left((n+r)(n+r-1)a_n - a_{n-1}\right) x^{n+r-1} = 0. \tag{4.82}$$

The indicial equation is $r(r - 1) = 0$, from which $r = 0, 1$.
Consider the larger of the two, i.e. $r = 1$. For this we get

$$a_n = \frac{1}{n(n+1)}\,a_{n-1}, \tag{4.83}$$

$$a_n = \frac{1}{n!\,(n+1)!}\,a_0. \tag{4.84}$$

Thus, taking $a_0 = 1$,

$$y_1(x) = x + \frac{1}{2}x^2 + \frac{1}{12}x^3 + \frac{1}{144}x^4 + \cdots. \tag{4.85}$$

From Eq. (4.58), the second solution is

$$y_2(x) = k\,y_1(x)\ln x + \sum_{n=0}^{\infty} b_n x^n. \tag{4.86}$$

It has derivatives

$$y_2'(x) = k\frac{y_1(x)}{x} + k\,y_1'(x)\ln x + \sum_{n=0}^{\infty} n b_n x^{n-1}, \tag{4.87}$$

$$y_2''(x) = -k\frac{y_1(x)}{x^2} + 2k\frac{y_1'(x)}{x} + k\,y_1''(x)\ln x + \sum_{n=0}^{\infty} n(n-1) b_n x^{n-2}. \tag{4.88}$$

To take advantage of Eq. (4.81), let us multiply the second derivative by x.

$$xy_2''(x) = -k\frac{y_1(x)}{x} + 2k\,y_1'(x) + k\,x y_1''(x)\ln x + \sum_{n=0}^{\infty} n(n-1) b_n x^{n-1}. \tag{4.89}$$

Now since $y_1$ is a solution of Eq. (4.81), we have $xy_1'' = y_1$; thus,

$$xy_2''(x) = -k\frac{y_1(x)}{x} + 2k\,y_1'(x) + k\,y_1(x)\ln x + \sum_{n=0}^{\infty} n(n-1) b_n x^{n-1}. \tag{4.90}$$

Now subtract Eq. (4.86) from both sides and then enforce Eq. (4.81) to get

$$0 = xy_2''(x) - y_2(x) = -k\frac{y_1(x)}{x} + 2k\,y_1'(x) + k\,y_1(x)\ln x + \sum_{n=0}^{\infty} n(n-1) b_n x^{n-1} - \left(k\,y_1(x)\ln x + \sum_{n=0}^{\infty} b_n x^n\right). \tag{4.91}$$
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Simplifying and rearranging, we get 

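$$-k\frac{y_1(x)}{x} + 2k\,y_1'(x) + \sum_{n=0}^{\infty} n(n-1) b_n x^{n-1} - \sum_{n=0}^{\infty} b_n x^n = 0. \tag{4.92}$$

Substituting the solution $y_1(x)$ already obtained, we get

$$0 = -k\left(1 + \frac{1}{2}x + \frac{1}{12}x^2 + \cdots\right) + 2k\left(1 + x + \frac{1}{4}x^2 + \cdots\right) + \left(2b_2 x + 6b_3 x^2 + \cdots\right) - \left(b_0 + b_1 x + b_2 x^2 + \cdots\right). \tag{4.93}$$

Collecting terms, we have

$$k = b_0, \tag{4.94}$$

$$b_{n+1} = \frac{1}{n(n+1)}\left(b_n - \frac{k(2n+1)}{n!\,(n+1)!}\right) \quad \text{for } n = 1, 2, \ldots. \tag{4.95}$$

Thus,

$$y_2(x) = b_0\,y_1 \ln x + b_0\left(1 - \frac{3}{4}x^2 - \frac{7}{36}x^3 - \frac{35}{1728}x^4 - \cdots\right) + b_1\underbrace{\left(x + \frac{1}{2}x^2 + \frac{1}{12}x^3 + \frac{1}{144}x^4 + \cdots\right)}_{=y_1}. \tag{4.96}$$

Since the last part of the series, shown in an under-braced term, is actually $y_1(x)$, and we already have $C_1 y_1$ as part of the solution, we choose $b_1 = 0$. Because we also allow for a $C_2$, we can then set $b_0 = 1$.
Thus, we take

$$y_2(x) = y_1 \ln x + \left(1 - \frac{3}{4}x^2 - \frac{7}{36}x^3 - \frac{35}{1728}x^4 - \cdots\right). \tag{4.97}$$

The general solution, $y = C_1 y_1 + C_2 y_2$, is

$$y(x) = C_1\underbrace{\left(x + \frac{1}{2}x^2 + \frac{1}{12}x^3 + \frac{1}{144}x^4 + \cdots\right)}_{y_1} + C_2\underbrace{\left(\left(x + \frac{1}{2}x^2 + \frac{1}{12}x^3 + \frac{1}{144}x^4 + \cdots\right)\ln x + \left(1 - \frac{3}{4}x^2 - \frac{7}{36}x^3 - \frac{35}{1728}x^4 - \cdots\right)\right)}_{y_2}. \tag{4.98}$$

It can also be shown that the solution can be represented compactly as

$$y(x) = \sqrt{x}\left(C_1 I_1(2\sqrt{x}) + C_2 K_1(2\sqrt{x})\right), \tag{4.99}$$

where $I_1$ and $K_1$ are what is known as modified Bessel functions of the first and second kinds, respectively, both of order 1. The function $I_1(s)$ is non-singular, while $K_1(s)$ is singular at $s = 0$.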
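The coefficients quoted in Eqs. (4.85) and (4.97) can be regenerated in exact rational arithmetic from Eqs. (4.84), (4.94), and (4.95). The sketch below is an illustrative addition; the function names `y1_coeffs` and `b_coeffs` are hypothetical, and the defaults $b_0 = 1$, $b_1 = 0$ follow the choices made in the example.

```python
from fractions import Fraction
from math import factorial

def y1_coeffs(n_max):
    """Coefficients of x^(n+1) in y_1 for n = 0..n_max, with a_0 = 1:
    a_n = 1 / (n! (n+1)!)."""
    return [Fraction(1, factorial(n) * factorial(n + 1))
            for n in range(n_max + 1)]

def b_coeffs(n_max, b0=Fraction(1), b1=Fraction(0)):
    """b_0..b_{n_max} from k = b_0 and
    b_{n+1} = (b_n - k (2n+1) / (n! (n+1)!)) / (n (n+1)), n >= 1."""
    k = b0
    b = [b0, b1]
    for n in range(1, n_max):
        corr = k * (2 * n + 1) / (factorial(n) * factorial(n + 1))
        b.append((b[n] - corr) / (n * (n + 1)))
    return b

if __name__ == "__main__":
    print([str(c) for c in y1_coeffs(3)])
    print([str(c) for c in b_coeffs(4)])
```

Using `Fraction` avoids any rounding, so the output can be compared digit for digit with the fractions in the text.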
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4.1.2.3 Irregular singular point 

If $P(a) = 0$ and in addition either $(x - a)Q/P$ or $(x - a)^2 R/P$ is not analytic at $x = a$, this point is an irregular singular point. In this case a series solution cannot be guaranteed.

4.1.3 Higher order equations 

Similar techniques can sometimes be used for equations of higher order. 



Example 4.5

Solve

$$y''' - xy = 0, \tag{4.100}$$

around $x = 0$.
Let

$$y = \sum_{n=0}^{\infty} a_n x^n, \tag{4.101}$$

from which

$$xy = \sum_{n=1}^{\infty} a_{n-1} x^n, \tag{4.102}$$

$$y''' = 6a_3 + \sum_{n=1}^{\infty} (n+1)(n+2)(n+3)\,a_{n+3}\,x^n. \tag{4.103}$$

Substituting into Eq. (4.100), we find that

$$a_3 = 0, \tag{4.104}$$

$$a_{n+3} = \frac{1}{(n+1)(n+2)(n+3)}\,a_{n-1}, \tag{4.105}$$

which gives the general solution

$$y(x) = a_0\left(1 + \frac{1}{24}x^4 + \frac{1}{8064}x^8 + \cdots\right) + a_1 x\left(1 + \frac{1}{60}x^4 + \frac{1}{30240}x^8 + \cdots\right) + a_2 x^2\left(1 + \frac{1}{120}x^4 + \frac{1}{86400}x^8 + \cdots\right). \tag{4.106}$$

For $y(0) = 1$, $y'(0) = 0$, $y''(0) = 0$, we get $a_0 = 1$, $a_1 = 0$, and $a_2 = 0$. The exact solution and the two-term approximation to the exact solution, $y \sim 1 + x^4/24$, are shown in Fig. 4.4. The exact solution is expressed in terms of one of the hypergeometric functions, see Sec. 10.7.8 of the Appendix, and is

$$y = {}_0F_2\left(\{\};\left\{\tfrac{1}{2},\tfrac{3}{4}\right\};\frac{x^4}{64}\right). \tag{4.107}$$
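The three families of coefficients in Eq. (4.106) follow mechanically from Eqs. (4.104) and (4.105). As a hedged check (the function name `third_order_coeffs` is an illustrative choice, not from the notes), exact rational arithmetic reproduces the denominators 24, 8064, 60, 30240, 120, and 86400:

```python
from fractions import Fraction

def third_order_coeffs(a0, a1, a2, n_max):
    """Coefficients a_0..a_{n_max} of y = sum a_n x^n for y''' - x y = 0,
    using a_3 = 0 and a_{n+3} = a_{n-1} / ((n+1)(n+2)(n+3)), n >= 1."""
    a = [Fraction(a0), Fraction(a1), Fraction(a2), Fraction(0)]
    for n in range(1, n_max - 2):
        a.append(a[n - 1] / ((n + 1) * (n + 2) * (n + 3)))
    return a[: n_max + 1]

if __name__ == "__main__":
    print(third_order_coeffs(1, 0, 0, 8))   # a_4 = 1/24, a_8 = 1/8064
    print(third_order_coeffs(0, 1, 0, 9))   # a_5 = 1/60, a_9 = 1/30240
    print(third_order_coeffs(0, 0, 1, 10))  # a_6 = 1/120, a_10 = 1/86400
```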



CC BY-NC-ND. 29 July 2012, Sen & Powers.



Figure 4.4: Comparison of truncated series and exact solutions for $y''' - xy = 0$, $y(0) = 1$, $y'(0) = 0$, $y''(0) = 0$. Curves shown: exact and $y = 1 + x^4/24$.

4.2 Perturbation methods 

Perturbation methods, also known as linearization or asymptotic techniques, are not as 
rigorous as infinite series methods in that usually it is impossible to make a statement 
regarding convergence. Nevertheless, the methods have proven to be powerful in many 
regimes of applied mathematics, science, and engineering. 

The method hinges on the identification of a small parameter $\epsilon$, $0 < \epsilon \ll 1$. Typically there is an easily obtained solution when $\epsilon = 0$. One then uses this solution as a seed to construct a linear theory about it. The resulting set of linear equations is then solved, giving a solution which is valid in a regime near $\epsilon = 0$.

4.2.1 Algebraic and transcendental equations 

To illustrate the method of solution, we begin with quadratic algebraic equations for which 
exact solutions are available. We can then easily see the advantages and limitations of the 
method. 



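Example 4.6

For $0 < \epsilon \ll 1$ solve

$$x^2 + \epsilon x - 1 = 0. \tag{4.108}$$

Let

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots. \tag{4.109}$$

Substituting into Eq. (4.108),

$$\underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)^2}_{=x^2} + \epsilon\underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)}_{=x} - 1 = 0, \tag{4.110}$$

expanding the square by polynomial multiplication,

$$\left(x_0^2 + 2x_1 x_0 \epsilon + \left(x_1^2 + 2x_2 x_0\right)\epsilon^2 + \cdots\right) + \left(x_0 \epsilon + x_1 \epsilon^2 + \cdots\right) - 1 = 0. \tag{4.111}$$

Regrouping, we get

$$\underbrace{\left(x_0^2 - 1\right)}_{=0}\epsilon^0 + \underbrace{\left(2x_1 x_0 + x_0\right)}_{=0}\epsilon^1 + \underbrace{\left(x_1^2 + 2x_0 x_2 + x_1\right)}_{=0}\epsilon^2 + \cdots = 0. \tag{4.112}$$

Because $\epsilon^0, \epsilon^1, \epsilon^2, \ldots$ are linearly independent, the coefficients in Eq. (4.112) must each equal zero. Thus, we get

$$\begin{aligned}
O(\epsilon^0):&\quad x_0^2 - 1 = 0 &&\Rightarrow\; x_0 = 1,\ -1,\\
O(\epsilon^1):&\quad 2x_0 x_1 + x_0 = 0 &&\Rightarrow\; x_1 = -\tfrac{1}{2},\ -\tfrac{1}{2},\\
O(\epsilon^2):&\quad x_1^2 + 2x_0 x_2 + x_1 = 0 &&\Rightarrow\; x_2 = \tfrac{1}{8},\ -\tfrac{1}{8}.
\end{aligned} \tag{4.113}$$

The solutions are

$$x = 1 - \frac{\epsilon}{2} + \frac{\epsilon^2}{8} + \cdots, \tag{4.114}$$

and

$$x = -1 - \frac{\epsilon}{2} - \frac{\epsilon^2}{8} + \cdots. \tag{4.115}$$

The exact solutions can also be expanded

$$x = \frac{1}{2}\left(-\epsilon \pm \sqrt{\epsilon^2 + 4}\right), \tag{4.116}$$

$$x = \pm 1 - \frac{\epsilon}{2} \pm \frac{\epsilon^2}{8} + \cdots, \tag{4.117}$$

to give the same results. The exact solution and the linear approximation are shown in Fig. 4.5.

Figure 4.5: Comparison of asymptotic and exact solutions for $x^2 + \epsilon x - 1 = 0$. Curves shown: linear and exact.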
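A quick numerical comparison makes the accuracy of the three-term expansions concrete. This is a minimal sketch added for illustration; the function names `exact_roots` and `asymptotic_roots` and the sample values of $\epsilon$ are assumptions.

```python
import math

def exact_roots(eps):
    """Both roots of x^2 + eps*x - 1 = 0 from the quadratic formula."""
    disc = math.sqrt(eps * eps + 4.0)
    return (-eps + disc) / 2.0, (-eps - disc) / 2.0

def asymptotic_roots(eps):
    """Three-term expansions of Eqs. (4.114) and (4.115)."""
    return (1.0 - eps / 2.0 + eps**2 / 8.0,
            -1.0 - eps / 2.0 - eps**2 / 8.0)

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        for exact, approx in zip(exact_roots(eps), asymptotic_roots(eps)):
            print(eps, exact, approx, abs(exact - approx))
```

Because the $O(\epsilon^3)$ term of the expansion vanishes for this equation, the printed error falls roughly like $\epsilon^4$ as $\epsilon$ is reduced.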
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I 

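Example 4.7

For $0 < \epsilon \ll 1$ solve

$$\epsilon x^2 + x - 1 = 0. \tag{4.118}$$

Note as $\epsilon \to 0$, the equation becomes singular. Let

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots. \tag{4.119}$$

Substituting into Eq. (4.118), we get

$$\epsilon\underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)^2}_{x^2} + \underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)}_{x} - 1 = 0. \tag{4.120}$$

Expanding the quadratic term gives

$$\epsilon\left(x_0^2 + 2\epsilon x_0 x_1 + \cdots\right) + \left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right) - 1 = 0, \tag{4.121}$$

$$\underbrace{\left(x_0 - 1\right)}_{=0}\epsilon^0 + \underbrace{\left(x_0^2 + x_1\right)}_{=0}\epsilon^1 + \underbrace{\left(2x_0 x_1 + x_2\right)}_{=0}\epsilon^2 + \cdots = 0. \tag{4.122}$$

Because of linear independence of $\epsilon^0, \epsilon^1, \epsilon^2, \ldots$, their coefficients must be zero. Thus, collecting different powers of $\epsilon$, we get

$$\begin{aligned}
O(\epsilon^0):&\quad x_0 - 1 = 0 &&\Rightarrow\; x_0 = 1,\\
O(\epsilon^1):&\quad x_0^2 + x_1 = 0 &&\Rightarrow\; x_1 = -1,\\
O(\epsilon^2):&\quad 2x_0 x_1 + x_2 = 0 &&\Rightarrow\; x_2 = 2.
\end{aligned} \tag{4.123}$$

This gives one solution

$$x = 1 - \epsilon + 2\epsilon^2 + \cdots. \tag{4.124}$$

To get the other solution, let

$$X = \frac{x}{\epsilon^{\alpha}}. \tag{4.125}$$

Equation (4.118) becomes

$$\epsilon^{2\alpha+1} X^2 + \epsilon^{\alpha} X - 1 = 0. \tag{4.126}$$

The first two terms are of the same order if $2\alpha + 1 = \alpha$. This demands $\alpha = -1$. With this,

$$X = x\epsilon, \qquad \epsilon^{-1} X^2 + \epsilon^{-1} X - 1 = 0. \tag{4.127}$$

This gives

$$X^2 + X - \epsilon = 0. \tag{4.128}$$

We expand

$$X = X_0 + \epsilon X_1 + \epsilon^2 X_2 + \cdots, \tag{4.129}$$

so

$$\underbrace{\left(X_0 + \epsilon X_1 + \epsilon^2 X_2 + \cdots\right)^2}_{X^2} + \underbrace{\left(X_0 + \epsilon X_1 + \epsilon^2 X_2 + \cdots\right)}_{X} - \epsilon = 0, \tag{4.130}$$

$$\left(X_0^2 + 2\epsilon X_0 X_1 + \epsilon^2\left(X_1^2 + 2X_0 X_2\right) + \cdots\right) + \left(X_0 + \epsilon X_1 + \epsilon^2 X_2 + \cdots\right) - \epsilon = 0. \tag{4.131}$$

Collecting terms of the same order

$$\begin{aligned}
O(\epsilon^0):&\quad X_0^2 + X_0 = 0 &&\Rightarrow\; X_0 = -1,\ 0,\\
O(\epsilon^1):&\quad 2X_0 X_1 + X_1 = 1 &&\Rightarrow\; X_1 = -1,\ 1,\\
O(\epsilon^2):&\quad X_1^2 + 2X_0 X_2 + X_2 = 0 &&\Rightarrow\; X_2 = 1,\ -1,
\end{aligned} \tag{4.132}$$

gives the two solutions

$$X = -1 - \epsilon + \epsilon^2 + \cdots, \tag{4.133}$$

$$X = \epsilon - \epsilon^2 + \cdots, \tag{4.134}$$

or, with $X = x\epsilon$,

$$x = \frac{1}{\epsilon}\left(-1 - \epsilon + \epsilon^2 + \cdots\right), \tag{4.135}$$

$$x = 1 - \epsilon + \cdots. \tag{4.136}$$

Expansion of the exact solutions

$$x = \frac{1}{2\epsilon}\left(-1 \pm \sqrt{1 + 4\epsilon}\right), \tag{4.137}$$

$$x = \frac{1}{2\epsilon}\left(-1 \pm \left(1 + 2\epsilon - 2\epsilon^2 + 4\epsilon^3 + \cdots\right)\right), \tag{4.138}$$

gives the same results. The exact solution and the linear approximation are shown in Fig. 4.6.

Figure 4.6: Comparison of asymptotic and exact solutions (one exact curve and the asymptotic branches).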
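The regular and rescaled expansions of this example can be compared against the quadratic formula. The sketch below is an illustrative addition; `exact_roots` and `asymptotic_roots` are hypothetical names, and the first returned value is the root that stays finite as $\epsilon \to 0$.

```python
import math

def exact_roots(eps):
    """(regular, singular) roots of eps*x^2 + x - 1 = 0."""
    disc = math.sqrt(1.0 + 4.0 * eps)
    return (-1.0 + disc) / (2.0 * eps), (-1.0 - disc) / (2.0 * eps)

def asymptotic_roots(eps):
    """Eq. (4.124) for the regular root, Eq. (4.135) for the singular one."""
    regular = 1.0 - eps + 2.0 * eps**2
    singular = (-1.0 - eps + eps**2) / eps
    return regular, singular

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print(eps, exact_roots(eps), asymptotic_roots(eps))
```

The singular root grows like $-1/\epsilon$, which is why the plain expansion in powers of $\epsilon$ misses it and the rescaling $X = x\epsilon$ is needed.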

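Example 4.8

Solve

$$\cos x = \epsilon \sin(x + \epsilon), \tag{4.139}$$

for x near $\pi/2$.

Fig. 4.7 shows a plot of $\cos x$ and $\epsilon \sin(x + \epsilon)$ for $\epsilon = 0.1$. It is seen that there are multiple intersections near $x = \left(n + \tfrac{1}{2}\right)\pi$, where $n = 0, \pm 1, \pm 2, \ldots$. We seek only one of these.

Figure 4.7: Location of roots, showing $\cos x$ and $\epsilon \sin(x + \epsilon)$ for $\epsilon = 0.1$.

When we substitute

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots, \tag{4.140}$$

into Eq. (4.139), we find

$$\cos\underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)}_{x} = \epsilon \sin\big(\underbrace{x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots}_{x} + \epsilon\big). \tag{4.141}$$

Now we expand both the left and right hand sides in a Taylor series in $\epsilon$ about $\epsilon = 0$. We note that a general function $f(\epsilon)$ has such a Taylor series of $f(\epsilon) \sim f(0) + \epsilon f'(0) + (\epsilon^2/2) f''(0) + \cdots$. Expanding the left hand side, we get

$$\underbrace{\cos(x_0 + \epsilon x_1 + \cdots)}_{=\cos x} = \underbrace{\cos(x_0 + \epsilon x_1 + \cdots)\big|_{\epsilon=0}}_{=\cos x|_{\epsilon=0}} + \epsilon\underbrace{\underbrace{\left(-\sin(x_0 + \epsilon x_1 + \cdots)\right)}_{=d/dx(\cos x)|_{\epsilon=0}}\underbrace{\left(x_1 + 2\epsilon x_2 + \cdots\right)}_{=dx/d\epsilon|_{\epsilon=0}}\Big|_{\epsilon=0}}_{=d/d\epsilon(\cos x)|_{\epsilon=0}} + \cdots, \tag{4.142}$$

$$\cos(x_0 + \epsilon x_1 + \cdots) = \cos x_0 - \epsilon x_1 \sin x_0 + \cdots. \tag{4.143}$$

The right hand side is similar. We then arrive at Eq. (4.139) being expressed as

$$\cos x_0 - \epsilon x_1 \sin x_0 + \cdots = \epsilon\left(\sin x_0 + \cdots\right). \tag{4.144}$$

Collecting terms

$$\begin{aligned}
O(\epsilon^0):&\quad \cos x_0 = 0 &&\Rightarrow\; x_0 = \tfrac{\pi}{2},\\
O(\epsilon^1):&\quad -x_1 \sin x_0 - \sin x_0 = 0 &&\Rightarrow\; x_1 = -1.
\end{aligned} \tag{4.145}$$

The solution is

$$x = \frac{\pi}{2} - \epsilon + \cdots. \tag{4.146}$$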
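Since Eq. (4.139) has no closed-form root, a simple bracketing solver gives a reference value to compare with Eq. (4.146). This is a sketch under stated assumptions: the names `residual` and `bisect_root` are illustrative, and the bracket $[1, 2]$ is assumed to contain a single sign change, which holds for small $\epsilon$.

```python
import math

def residual(x, eps):
    """Left side minus right side of cos x = eps * sin(x + eps)."""
    return math.cos(x) - eps * math.sin(x + eps)

def bisect_root(eps, lo=1.0, hi=2.0, tol=1e-12):
    """Bisection on [lo, hi]; assumes residual changes sign there."""
    f_lo = residual(lo, eps)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid, eps)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        root = bisect_root(eps)
        print(eps, root, math.pi / 2.0 - eps, abs(root - (math.pi / 2.0 - eps)))
```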
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4.2.2 Regular perturbations 

Differential equations can also be solved using perturbation techniques. 



Example 4.9

For $0 < \epsilon \ll 1$ solve

$$y'' + \epsilon y^2 = 0, \tag{4.147}$$

$$y(0) = 1, \qquad y'(0) = 0. \tag{4.148}$$

Let

$$y(x) = y_0(x) + \epsilon y_1(x) + \epsilon^2 y_2(x) + \cdots, \tag{4.149}$$

$$y'(x) = y_0'(x) + \epsilon y_1'(x) + \epsilon^2 y_2'(x) + \cdots, \tag{4.150}$$

$$y''(x) = y_0''(x) + \epsilon y_1''(x) + \epsilon^2 y_2''(x) + \cdots. \tag{4.151}$$

Substituting into Eq. (4.147),

$$\underbrace{\left(y_0''(x) + \epsilon y_1''(x) + \epsilon^2 y_2''(x) + \cdots\right)}_{y''} + \epsilon\underbrace{\left(y_0(x) + \epsilon y_1(x) + \epsilon^2 y_2(x) + \cdots\right)^2}_{y^2} = 0, \tag{4.152}$$

$$\left(y_0''(x) + \epsilon y_1''(x) + \epsilon^2 y_2''(x) + \cdots\right) + \epsilon\left(y_0^2(x) + 2\epsilon y_1(x) y_0(x) + \cdots\right) = 0. \tag{4.153}$$

Substituting into the boundary conditions, Eq. (4.148):

$$y_0(0) + \epsilon y_1(0) + \epsilon^2 y_2(0) + \cdots = 1, \tag{4.154}$$

$$y_0'(0) + \epsilon y_1'(0) + \epsilon^2 y_2'(0) + \cdots = 0. \tag{4.155}$$

Collecting terms

$$\begin{aligned}
O(\epsilon^0):&\quad y_0'' = 0, && y_0(0) = 1,\ y_0'(0) = 0 &&\Rightarrow\; y_0 = 1,\\
O(\epsilon^1):&\quad y_1'' = -y_0^2, && y_1(0) = 0,\ y_1'(0) = 0 &&\Rightarrow\; y_1 = -\frac{x^2}{2},\\
O(\epsilon^2):&\quad y_2'' = -2y_0 y_1, && y_2(0) = 0,\ y_2'(0) = 0 &&\Rightarrow\; y_2 = \frac{x^4}{12}.
\end{aligned} \tag{4.156}$$

The solution is

$$y = 1 - \epsilon\frac{x^2}{2} + \epsilon^2\frac{x^4}{12} + \cdots. \tag{4.157}$$

For validity of the asymptotic solution, we must have

$$1 \gg \epsilon\frac{x^2}{2}. \tag{4.158}$$

This solution becomes invalid when the second term is as large as or larger than the first:

$$1 \le \epsilon\frac{x^2}{2}, \tag{4.159}$$

$$|x| \ge \sqrt{\frac{2}{\epsilon}}. \tag{4.160}$$

Using the techniques of the previous chapter it is seen that Eqs. (4.147)-(4.148) possess an exact solution. With

$$u = \frac{dy}{dx}, \qquad \frac{d^2y}{dx^2} = \frac{dy'}{dy}\frac{dy}{dx} = \frac{du}{dy}\,u, \tag{4.161}$$

Eq. (4.147) becomes

$$u\frac{du}{dy} + \epsilon y^2 = 0, \tag{4.162}$$

$$u\,du = -\epsilon y^2\,dy, \tag{4.163}$$

$$\frac{u^2}{2} = -\frac{\epsilon}{3}y^3 + C, \tag{4.164}$$

$$u = 0 \text{ when } y = 1, \text{ so } C = \frac{\epsilon}{3}, \tag{4.165}$$

$$u = \pm\sqrt{\frac{2\epsilon}{3}\left(1 - y^3\right)}, \tag{4.166}$$

$$\frac{dy}{dx} = \pm\sqrt{\frac{2\epsilon}{3}\left(1 - y^3\right)}, \tag{4.167}$$

$$dx = \pm\frac{dy}{\sqrt{\frac{2\epsilon}{3}\left(1 - y^3\right)}}, \tag{4.168}$$

$$x = \pm\int_1^y \frac{ds}{\sqrt{\frac{2\epsilon}{3}\left(1 - s^3\right)}}. \tag{4.169}$$

It can be shown that this integral can be represented in terms of a) the Gamma function, $\Gamma$, (see Sec. 10.7.1 of the Appendix), and b) Gauss's³ hypergeometric function, ${}_2F_1(a, b, c, z)$, (see Sec. 10.7.8 of the Appendix), as follows:
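As a hedged sketch of how such a representation can arise (a reconstruction from Eq. (4.169) that assumes $0 \le y \le 1$; it may differ in form from the authors' expression), split the integral at $s = 0$, expand $(1 - s^3)^{-1/2}$ in a binomial series, and integrate term by term:

```latex
\begin{align*}
\int_0^y \frac{ds}{\sqrt{1-s^3}}
  &= y\,{}_2F_1\!\left(\tfrac{1}{3},\tfrac{1}{2};\tfrac{4}{3};y^3\right),
&
\int_0^1 \frac{ds}{\sqrt{1-s^3}}
  &= \frac{\sqrt{\pi}\,\Gamma\!\left(\tfrac{1}{3}\right)}{3\,\Gamma\!\left(\tfrac{5}{6}\right)},
\\[4pt]
x &= \mp\sqrt{\frac{\pi}{6\epsilon}}\,
     \frac{\Gamma\!\left(\tfrac{1}{3}\right)}{\Gamma\!\left(\tfrac{5}{6}\right)}
   \pm\sqrt{\frac{3}{2\epsilon}}\;
     y\,{}_2F_1\!\left(\tfrac{1}{3},\tfrac{1}{2};\tfrac{4}{3};y^3\right).
\end{align*}
```

Here the second line uses $\int_1^y = -\left(\int_0^1 - \int_0^y\right)$ together with the prefactor $\sqrt{3/(2\epsilon)}$ from Eq. (4.169).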

It is likely difficult to invert either Eq. (4.169) or (4.170) to get $y(x)$ explicitly. For small $\epsilon$, the essence of the solution is better conveyed by the asymptotic solution. A portion of the asymptotic and exact solutions for $\epsilon = 0.1$ are shown in Fig. 4.8. For this value, the asymptotic solution is expected to be invalid for $|x| > \sqrt{2/\epsilon} = 4.47$.
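The validity estimate can be probed by integrating Eq. (4.147) numerically and comparing with Eq. (4.157) inside and outside $|x| < \sqrt{2/\epsilon}$. The following is a minimal sketch, not the method used for the figure; the fixed-step RK4 integrator and the names `rk4_final`, `numerical_y`, and `asymptotic_y` are assumptions made for illustration.

```python
def rk4_final(rhs, state, t_end, steps):
    """Advance `state` from 0 to t_end with classical fixed-step RK4."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def numerical_y(eps, x_end, steps=4000):
    """y(x_end) for y'' + eps*y^2 = 0, y(0) = 1, y'(0) = 0."""
    rhs = lambda s: [s[1], -eps * s[0] ** 2]
    return rk4_final(rhs, [1.0, 0.0], x_end, steps)[0]

def asymptotic_y(eps, x):
    """Three-term expansion of Eq. (4.157)."""
    return 1.0 - eps * x**2 / 2.0 + eps**2 * x**4 / 12.0

if __name__ == "__main__":
    eps = 0.1
    for x in (1.0, 3.0, 6.0):
        print(x, numerical_y(eps, x), asymptotic_y(eps, x))
```

For $\epsilon = 0.1$ the two agree closely at $x = 1$ and separate visibly by $x = 6$, which lies beyond $\sqrt{2/\epsilon} \approx 4.47$.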



Example 4.10

Solve

$$y'' + \epsilon y^2 = 0, \qquad y(0) = 1, \quad y'(0) = \epsilon. \tag{4.171}$$

Let

$$y(x) = y_0(x) + \epsilon y_1(x) + \epsilon^2 y_2(x) + \cdots. \tag{4.172}$$

3 Johann Carl Friedrich Gauss, 1777-1855, Brunswick-born German mathematician of tremendous influence.

Figure 4.8: Comparison of asymptotic and exact solutions (far-field view and close-up view) for $y'' + \epsilon y^2 = 0$, $y(0) = 1$, $y'(0) = 0$, $\epsilon = 0.1$.

Substituting into Eq. (4.171) and collecting terms

$$\begin{aligned}
O(\epsilon^0):&\quad y_0'' = 0, && y_0(0) = 1,\ y_0'(0) = 0 &&\Rightarrow\; y_0 = 1,\\
O(\epsilon^1):&\quad y_1'' = -y_0^2, && y_1(0) = 0,\ y_1'(0) = 1 &&\Rightarrow\; y_1 = -\frac{x^2}{2} + x,\\
O(\epsilon^2):&\quad y_2'' = -2y_0 y_1, && y_2(0) = 0,\ y_2'(0) = 0 &&\Rightarrow\; y_2 = \frac{x^4}{12} - \frac{x^3}{3}.
\end{aligned} \tag{4.173}$$

The solution is

$$y = 1 - \epsilon\left(\frac{x^2}{2} - x\right) + \epsilon^2\left(\frac{x^4}{12} - \frac{x^3}{3}\right) + \cdots. \tag{4.174}$$

A portion of the asymptotic and exact solutions for $\epsilon = 0.1$ are shown in Fig. 4.9. Compared to the previous example, there is a slight offset from the y axis.

Figure 4.9: Comparison of asymptotic and exact solutions.
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4.2.3 Strained coordinates 

The regular perturbation expansion may not be valid over the complete domain of interest. The method of strained coordinates, also known as the Poincaré⁴-Lindstedt⁵ method, is designed to address this. In a slightly different context this method is known as Lighthill's⁶ method.



Example 4.11

Find an approximate solution of the Duffing equation:

$$\ddot{x} + x + \epsilon x^3 = 0, \qquad x(0) = 1, \quad \dot{x}(0) = 0. \tag{4.175}$$

First let's give some physical motivation, as also outlined in Section 10.2 of Kaplan. One problem in which Duffing's equation arises is the undamped motion of a mass subject to a non-linear spring force. Consider a body of mass m moving in the horizontal x plane. Initially the body is given a small positive displacement $x(0) = x_0$. The body has zero initial velocity $dx/dt(0) = 0$. The body is subjected to a non-linear spring force $F_s$ oriented such that it will pull the body towards $x = 0$:

$$F_s = \left(k_0 + k_1 x^2\right)x. \tag{4.176}$$

Here $k_0$ and $k_1$ are dimensional constants with SI units N/m and N/m³ respectively. Newton's second law gives us

$$m\frac{d^2x}{dt^2} = -\left(k_0 + k_1 x^2\right)x, \tag{4.177}$$

$$m\frac{d^2x}{dt^2} + \left(k_0 + k_1 x^2\right)x = 0, \qquad x(0) = x_0, \quad \frac{dx}{dt}(0) = 0. \tag{4.178}$$

Choose an as yet arbitrary length scale L and an as yet arbitrary time scale T with which to scale the problem and take:

$$x^* = \frac{x}{L}, \qquad t^* = \frac{t}{T}. \tag{4.179}$$

Substitute

$$\frac{mL}{T^2}\frac{d^2x^*}{dt^{*2}} + k_0 L x^* + k_1 L^3 x^{*3} = 0, \qquad L x^*(0) = x_0, \quad \frac{L}{T}\frac{dx^*}{dt^*}(0) = 0. \tag{4.180}$$

Rearrange to make all terms dimensionless:

$$\frac{d^2x^*}{dt^{*2}} + \frac{k_0 T^2}{m}x^* + \frac{k_1 L^2 T^2}{m}x^{*3} = 0, \qquad x^*(0) = \frac{x_0}{L}, \quad \frac{dx^*}{dt^*}(0) = 0. \tag{4.181}$$

Now we want to examine the effect of small non-linearities. Choose the length and time scales such that the leading order motion has an amplitude which is O(1) and a frequency which is O(1). So take

$$T = \sqrt{\frac{m}{k_0}}, \qquad L = x_0. \tag{4.182}$$

So

$$\frac{d^2x^*}{dt^{*2}} + x^* + \frac{k_1 x_0^2}{k_0}x^{*3} = 0, \qquad x^*(0) = 1, \quad \frac{dx^*}{dt^*}(0) = 0. \tag{4.183}$$

4 Henri Poincaré, 1854-1912, French polymath.

5 Anders Lindstedt, 1854-1939, Swedish mathematician, astronomer, and actuarial scientist.

6 Sir Michael James Lighthill, 1924-1998, British applied mathematician and noted open-water swimmer.

Choosing

$$\epsilon = \frac{k_1 x_0^2}{k_0}, \tag{4.184}$$

we get

$$\frac{d^2x^*}{dt^{*2}} + x^* + \epsilon x^{*3} = 0, \qquad x^*(0) = 1, \quad \frac{dx^*}{dt^*}(0) = 0. \tag{4.185}$$

So our asymptotic theory will be valid for

$$\epsilon \ll 1, \qquad k_1 x_0^2 \ll k_0. \tag{4.186}$$

Now, let's drop the superscripts and focus on the mathematics. An accurate numerical approximation to the exact solution $x(t)$ for $\epsilon = 0.2$ and the so-called phase plane for this solution, giving $dx/dt$ versus x, are shown in Fig. 4.10.

Figure 4.10: Numerical solution $x(t)$ and phase plane trajectory, $dx/dt$ versus x, for Duffing's equation, $\epsilon = 0.2$.

Note if $\epsilon = 0$, the solution is $x(t) = \cos t$, and thus $dx/dt = -\sin t$. Thus, for $\epsilon = 0$, $x^2 + (dx/dt)^2 = \cos^2 t + \sin^2 t = 1$. Thus, the $\epsilon = 0$ phase plane solution is a unit circle. The phase plane portrait of Fig. 4.10 displays a small deviation from a circle. This deviation would be more pronounced for larger $\epsilon$.

Let's use an asymptotic method to try to capture this solution. Using the expansion

$$x(t) = x_0(t) + \epsilon x_1(t) + \epsilon^2 x_2(t) + \cdots, \tag{4.187}$$

and collecting terms, we find

$$\begin{aligned}
O(\epsilon^0):&\quad \ddot{x}_0 + x_0 = 0, && x_0(0) = 1,\ \dot{x}_0(0) = 0 &&\Rightarrow\; x_0 = \cos t,\\
O(\epsilon^1):&\quad \ddot{x}_1 + x_1 = -x_0^3, && x_1(0) = 0,\ \dot{x}_1(0) = 0 &&\Rightarrow\; x_1 = \tfrac{1}{32}\left(-\cos t + \cos 3t - 12t\sin t\right).
\end{aligned} \tag{4.188}$$

Figure 4.11: Error plots for various approximations from the method of strained coordinates to Duffing's equation with $\epsilon = 0.2$. Difference between high accuracy numerical solution and: a) leading order asymptotic solution, b) uncorrected $O(\epsilon)$ asymptotic solution, c) corrected $O(\epsilon)$ asymptotic solution.

The difference between the exact solution and the leading order solution, $x_{exact}(t) - x_0(t)$, is plotted in Fig. 4.11a. The error is the same order of magnitude as the solution itself for moderate values of t. This is undesirable.

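To $O(\epsilon)$ the solution is

$$x = \cos t + \frac{\epsilon}{32}\Big(-\cos t + \cos 3t - \underbrace{12t\sin t}_{\text{secular term}}\Big) + \cdots. \tag{4.189}$$

This series has a so-called "secular term," $-\epsilon\frac{3}{8}t\sin t$, that grows without bound. Thus, our solution is only valid for $t \ll \epsilon^{-1}$.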

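Now nature may or may not admit unbounded growth depending on the problem. Let us return to the original Eq. (4.175) to consider whether or not unbounded growth is admissible. Eq. (4.175) can be integrated once via the following steps

$$\dot{x}\left(\ddot{x} + x + \epsilon x^3\right) = 0, \tag{4.190}$$

$$\dot{x}\ddot{x} + \dot{x}x + \epsilon\dot{x}x^3 = 0, \tag{4.191}$$

$$\frac{d}{dt}\left(\frac{1}{2}\dot{x}^2 + \frac{1}{2}x^2 + \frac{\epsilon}{4}x^4\right) = 0, \tag{4.192}$$

$$\frac{1}{2}\dot{x}^2 + \frac{1}{2}x^2 + \frac{\epsilon}{4}x^4 = \left(\frac{1}{2}\dot{x}^2 + \frac{1}{2}x^2 + \frac{\epsilon}{4}x^4\right)\bigg|_{t=0}, \tag{4.193}$$

$$\frac{1}{2}\dot{x}^2 + \frac{1}{2}x^2 + \frac{\epsilon}{4}x^4 = \frac{1}{4}\left(2 + \epsilon\right), \tag{4.194}$$

indicating that the solution is bounded. The difference between the exact solution and the uncorrected $O(\epsilon)$ solution, $x_{exact}(t) - (x_0(t) + \epsilon x_1(t))$, is plotted in Fig. 4.11b. There is some improvement for early time, but the solution is actually worse for later time. This is because of the secularity.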
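To have a solution valid for all time, we strain the time coordinate

$$t = \left(1 + c_1\epsilon + c_2\epsilon^2 + \cdots\right)\tau, \tag{4.195}$$

where $\tau$ is the new time variable. The $c_i$'s should be chosen to avoid secular terms.
Differentiating

$$\dot{x} = \frac{dx}{d\tau}\frac{d\tau}{dt} = \frac{dx}{d\tau}\left(\frac{dt}{d\tau}\right)^{-1}, \tag{4.196}$$

$$\dot{x} = \frac{dx}{d\tau}\left(1 + c_1\epsilon + c_2\epsilon^2 + \cdots\right)^{-1}, \tag{4.197}$$

$$\ddot{x} = \frac{d^2x}{d\tau^2}\left(1 + c_1\epsilon + c_2\epsilon^2 + \cdots\right)^{-2}, \tag{4.198}$$

$$\ddot{x} = \frac{d^2x}{d\tau^2}\left(1 - c_1\epsilon + \left(c_1^2 - c_2\right)\epsilon^2 + \cdots\right)^{2}, \tag{4.199}$$

$$\ddot{x} = \frac{d^2x}{d\tau^2}\left(1 - 2c_1\epsilon + \left(3c_1^2 - 2c_2\right)\epsilon^2 + \cdots\right). \tag{4.200}$$

Furthermore, we write

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots. \tag{4.201}$$

Substituting into Eq. (4.175), we get

$$\underbrace{\left(\frac{d^2x_0}{d\tau^2} + \epsilon\frac{d^2x_1}{d\tau^2} + \epsilon^2\frac{d^2x_2}{d\tau^2} + \cdots\right)\left(1 - 2c_1\epsilon + \left(3c_1^2 - 2c_2\right)\epsilon^2 + \cdots\right)}_{\ddot{x}} + \underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)}_{x} + \epsilon\underbrace{\left(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\right)^3}_{x^3} = 0. \tag{4.202}$$

Collecting terms, we get

$$O(\epsilon^0):\quad \frac{d^2x_0}{d\tau^2} + x_0 = 0, \qquad x_0(0) = 1,\ \frac{dx_0}{d\tau}(0) = 0, \qquad x_0(\tau) = \cos\tau, \tag{4.203}$$

$$\begin{aligned}
O(\epsilon^1):\quad \frac{d^2x_1}{d\tau^2} + x_1 &= 2c_1\frac{d^2x_0}{d\tau^2} - x_0^3, \qquad x_1(0) = 0,\ \frac{dx_1}{d\tau}(0) = 0,\\
&= -2c_1\cos\tau - \cos^3\tau,\\
&= -\left(2c_1 + \tfrac{3}{4}\right)\cos\tau - \tfrac{1}{4}\cos 3\tau,\\
x_1(\tau) &= \tfrac{1}{32}\left(-\cos\tau + \cos 3\tau\right), \quad \text{if we choose } c_1 = -\tfrac{3}{8}.
\end{aligned} \tag{4.204}$$

Thus,

$$x(\tau) = \cos\tau + \epsilon\frac{1}{32}\left(-\cos\tau + \cos 3\tau\right) + \cdots. \tag{4.205}$$

Since

$$t = \left(1 - \epsilon\frac{3}{8} + \cdots\right)\tau, \qquad \tau = \left(1 + \epsilon\frac{3}{8} + \cdots\right)t, \tag{4.206}$$

we get the corrected solution approximation to be

$$x(t) = \cos\Big(\underbrace{\left(1 + \epsilon\tfrac{3}{8} + \cdots\right)}_{\text{Frequency Modulation (FM)}}t\Big) + \epsilon\frac{1}{32}\left(-\cos\left(\left(1 + \epsilon\tfrac{3}{8} + \cdots\right)t\right) + \cos\left(3\left(1 + \epsilon\tfrac{3}{8} + \cdots\right)t\right)\right) + \cdots. \tag{4.207}$$

The difference between the exact solution and the corrected solution to $O(\epsilon)$, $x_{exact}(t) - (x_0(t) + \epsilon x_1(t))$, is plotted in Fig. 4.11c. The error is much smaller relative to the previous cases; there does appear to be a slight growth in the amplitude of the error with time. This might not be expected, but in fact is a characteristic behavior of the truncation error of the numerical method used to generate the exact solution.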
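The contrast between the secular and strained approximations can be reproduced with a short numerical experiment. The sketch below is an illustrative addition, not the code behind Fig. 4.11: it integrates Eq. (4.175) with a fixed-step RK4 scheme (an assumed choice), evaluates Eq. (4.189) and Eq. (4.207) truncated at $O(\epsilon)$ in the frequency, and also monitors the first integral of Eq. (4.194). All function names are hypothetical.

```python
import math

def duffing_rk4(eps, t_end, steps=40000):
    """Integrate x'' + x + eps*x^3 = 0, x(0) = 1, x'(0) = 0; return (x, v)."""
    h = t_end / steps
    x, v = 1.0, 0.0
    acc = lambda x: -x - eps * x**3
    for _ in range(steps):
        k1x, k1v = v, acc(x)
        k2x, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x, v

def naive(eps, t):
    """Uncorrected O(eps) expansion, Eq. (4.189), with its secular term."""
    return math.cos(t) + eps / 32.0 * (
        -math.cos(t) + math.cos(3 * t) - 12 * t * math.sin(t))

def strained(eps, t):
    """Strained-coordinate result, Eq. (4.207), with tau = (1 + 3 eps/8) t."""
    tau = (1.0 + 3.0 * eps / 8.0) * t
    return math.cos(tau) + eps / 32.0 * (-math.cos(tau) + math.cos(3 * tau))

def energy(eps, x, v):
    """First integral of Eq. (4.194); equals (2 + eps)/4 on the solution."""
    return 0.5 * v * v + 0.5 * x * x + 0.25 * eps * x**4

if __name__ == "__main__":
    eps, t = 0.2, 40.0
    x_num, v_num = duffing_rk4(eps, t)
    print("numerical:", x_num, " energy:", energy(eps, x_num, v_num))
    print("naive    :", naive(eps, t))
    print("strained :", strained(eps, t))
```

At $t = 40$ with $\epsilon = 0.2$, the naive expansion leaves the bounded range implied by Eq. (4.194), while the strained result stays close to the numerical value.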
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I 

Example 4.12

Find the amplitude of the limit cycle oscillations of the van der Pol⁷ equation

$$\ddot{x} - \epsilon\left(1 - x^2\right)\dot{x} + x = 0, \qquad x(0) = A, \quad \dot{x}(0) = 0, \quad 0 < \epsilon \ll 1. \tag{4.208}$$

Here A is the amplitude and is considered to be an adjustable parameter in this problem. If a limit cycle exists, it will be valid as $t \to \infty$. Note this could be thought of as a model for a mass-spring-damper system with a non-linear damping coefficient of $-\epsilon(1 - x^2)$. For small $|x|$, the damping coefficient is negative. From our intuition from linear mass-spring-damper systems, we recognize that this will lead to amplitude growth, at least for sufficiently small $|x|$. However, when the amplitude grows to $|x| > 1$, the damping coefficient becomes positive, thus decaying the amplitude. We might expect a limit cycle amplitude where there exists a balance between the tendency for amplitude to grow or decay.

Let

$$t = \left(1 + c_1\epsilon + c_2\epsilon^2 + \cdots\right)\tau, \tag{4.209}$$

so that Eq. (4.208) becomes

$$\frac{d^2x}{d\tau^2}\left(1 - 2c_1\epsilon + \cdots\right) - \epsilon\left(1 - x^2\right)\frac{dx}{d\tau}\left(1 - c_1\epsilon + \cdots\right) + x = 0. \tag{4.210}$$

We also use

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots. \tag{4.211}$$

Thus, we get

$$x_0 = A\cos\tau, \tag{4.212}$$

to $O(\epsilon^0)$. To $O(\epsilon)$, the equation is

$$\frac{d^2x_1}{d\tau^2} + x_1 = -2c_1 A\cos\tau - A\left(1 - \frac{A^2}{4}\right)\sin\tau + \frac{A^3}{4}\sin 3\tau. \tag{4.213}$$

Choosing $c_1 = 0$ and $A = 2$ in order to suppress secular terms, we get

$$x_1 = \frac{3}{4}\sin\tau - \frac{1}{4}\sin 3\tau. \tag{4.214}$$

The amplitude, to lowest order, is

$$A = 2, \tag{4.215}$$

so to $O(\epsilon)$ the solution is

$$x(t) = 2\cos\left(t + O(\epsilon^2)\right) + \epsilon\left(\frac{3}{4}\sin\left(t + O(\epsilon^2)\right) - \frac{1}{4}\sin\left(3\left(t + O(\epsilon^2)\right)\right)\right) + O(\epsilon^2). \tag{4.216}$$

The exact solution, $x_{exact}$, $\dot{x}_{exact}$, calculated by high precision numerics in the $x$, $\dot{x}$ phase plane, $x(t)$, the difference between the exact solution and the asymptotic leading order solution, $x_{exact}(t) - x_0(t)$, and the difference between the exact solution and the asymptotic solution corrected to $O(\epsilon)$, $x_{exact}(t) - (x_0(t) + \epsilon x_1(t))$, are plotted in Fig. 4.12. Because of the special choice of initial conditions, the solution trajectory lies for all time on the limit cycle of the phase plane. Note that the corrected solution is only marginally better than the leading order solution at this value of $\epsilon$. For smaller values of $\epsilon$, the relative errors between the two approximations would widen; that is, the asymptotic correction would become, relatively speaking, more accurate.
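The predicted limit cycle amplitude $A = 2$ can be checked by direct integration from different starting amplitudes. The sketch below is an illustrative addition with assumed names (`rk4_step`, `limit_cycle_amplitude`) and an assumed fixed-step RK4 integrator; it records the largest $|x|$ over a final time window after transients have decayed.

```python
def rk4_step(rhs, state, h):
    """One classical RK4 step for a first-order system."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def limit_cycle_amplitude(eps, x0, t_end=80.0, h=0.01, window=10.0):
    """Max |x| over the last `window` time units for the van der Pol
    equation x'' - eps (1 - x^2) x' + x = 0, x(0) = x0, x'(0) = 0."""
    def rhs(s):
        x, v = s
        return [v, eps * (1.0 - x * x) * v - x]
    state, amp = [x0, 0.0], 0.0
    steps = int(round(t_end / h))
    for i in range(steps):
        state = rk4_step(rhs, state, h)
        if (i + 1) * h >= t_end - window:
            amp = max(amp, abs(state[0]))
    return amp

if __name__ == "__main__":
    for x0 in (0.5, 2.0, 3.0):
        print(x0, limit_cycle_amplitude(0.3, x0))
```

Starting inside or outside the cycle, the late-time amplitude settles near 2, consistent with Eq. (4.215).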



7 Balthasar van der Pol, 1889-1959, Dutch physicist.

Figure 4.12: Results for van der Pol equation, $d^2x/dt^2 - \epsilon(1 - x^2)dx/dt + x = 0$, $x(0) = 2$, $\dot{x}(0) = 0$, $\epsilon = 0.3$: a) high precision numerical phase plane, b) high precision numerical calculation of $x(t)$, c) difference between exact and asymptotic leading order solution (blue), and difference between exact and corrected asymptotic solution to $O(\epsilon)$ (red) from the method of strained coordinates.

4.2.4 Multiple scales 

The method of multiple scales is a strategy for isolating features of a solution which may 
evolve on widely disparate scales. 



I 

Example 4-13 

Solve 



d 2 x , ?.dx 



x(0) = 0, 



dx 

~dl 



(0) = 1, 



0<e<l. 



(4.217) 



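Let $x = x(\tau, \tilde{\tau})$, where the fast time scale is

$$\tau = (1 + a_1\epsilon + a_2\epsilon^2 + \cdots)\,t, \tag{4.218}$$

and the slow time scale is

$$\tilde{\tau} = \epsilon t. \tag{4.219}$$

Since

$$x = x(\tau, \tilde{\tau}), \tag{4.220}$$

$$\frac{dx}{dt} = \frac{\partial x}{\partial\tau}\frac{d\tau}{dt} + \frac{\partial x}{\partial\tilde{\tau}}\frac{d\tilde{\tau}}{dt}. \tag{4.221}$$

The first derivative is

$$\frac{dx}{dt} = (1 + a_1\epsilon + a_2\epsilon^2 + \cdots)\frac{\partial x}{\partial\tau} + \epsilon\frac{\partial x}{\partial\tilde{\tau}}, \tag{4.222}$$

so

$$\frac{d}{dt} = (1 + a_1\epsilon + a_2\epsilon^2 + \cdots)\frac{\partial}{\partial\tau} + \epsilon\frac{\partial}{\partial\tilde{\tau}}. \tag{4.223}$$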
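Applying this operator to $dx/dt$, we get

$$\frac{d^2x}{dt^2} = (1 + a_1\epsilon + a_2\epsilon^2 + \cdots)^2\frac{\partial^2 x}{\partial\tau^2} + 2(1 + a_1\epsilon + a_2\epsilon^2 + \cdots)\,\epsilon\,\frac{\partial^2 x}{\partial\tau\,\partial\tilde{\tau}} + \epsilon^2\frac{\partial^2 x}{\partial\tilde{\tau}^2}. \tag{4.224}$$

Introduce

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots. \tag{4.225}$$

So to $O(\epsilon)$, Eq. (4.217) becomes

$$\underbrace{(1 + 2a_1\epsilon + \cdots)\frac{\partial^2(x_0 + \epsilon x_1 + \cdots)}{\partial\tau^2} + 2\epsilon\frac{\partial^2(x_0 + \cdots)}{\partial\tau\,\partial\tilde{\tau}} + \cdots}_{\ddot{x}} - \epsilon\underbrace{(1 - x_0^2 - \cdots)\frac{\partial(x_0 + \cdots)}{\partial\tau}}_{(1 - x^2)\dot{x}} + \cdots + \underbrace{(x_0 + \epsilon x_1 + \cdots)}_{x} = 0. \tag{4.226}$$

Collecting terms of $O(\epsilon^0)$, we have

$$\frac{\partial^2 x_0}{\partial\tau^2} + x_0 = 0, \quad \text{with } x_0(0,0) = 0, \quad \frac{\partial x_0}{\partial\tau}(0,0) = 1. \tag{4.227}$$

The solution is

$$x_0 = A(\tilde{\tau})\cos\tau + B(\tilde{\tau})\sin\tau, \quad \text{with } A(0) = 0, \quad B(0) = 1. \tag{4.228}$$

The terms of $O(\epsilon^1)$ give

$$\frac{\partial^2 x_1}{\partial\tau^2} + x_1 = -2a_1\frac{\partial^2 x_0}{\partial\tau^2} - 2\frac{\partial^2 x_0}{\partial\tau\,\partial\tilde{\tau}} + (1 - x_0^2)\frac{\partial x_0}{\partial\tau}, \tag{4.229}$$

$$= \left(2a_1 B + 2A' - A + \frac{A}{4}(A^2 + B^2)\right)\sin\tau + \left(2a_1 A - 2B' + B - \frac{B}{4}(A^2 + B^2)\right)\cos\tau + \frac{A}{4}(A^2 - 3B^2)\sin 3\tau - \frac{B}{4}(3A^2 - B^2)\cos 3\tau, \tag{4.230}$$

with

$$x_1(0,0) = 0, \tag{4.231}$$

$$\frac{\partial x_1}{\partial\tau}(0,0) = -a_1\frac{\partial x_0}{\partial\tau}(0,0) - \frac{\partial x_0}{\partial\tilde{\tau}}(0,0), \tag{4.232}$$

$$= -a_1 - \frac{\partial x_0}{\partial\tilde{\tau}}(0,0). \tag{4.233}$$

Since $\epsilon t$ is already represented in $\tilde{\tau}$, choose $a_1 = 0$. Then, to suppress secular terms, the coefficients of $\sin\tau$ and $\cos\tau$ must vanish:

$$2A' - A + \frac{A}{4}(A^2 + B^2) = 0, \tag{4.234}$$

$$2B' - B + \frac{B}{4}(A^2 + B^2) = 0. \tag{4.235}$$

Since $A(0) = 0$, try $A(\tilde{\tau}) = 0$. Then

$$2B' - B + \frac{B^3}{4} = 0. \tag{4.236}$$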
Multiplying by $B$, we get

$$2BB' - B^2 + \frac{B^4}{4} = 0, \tag{4.237}$$

$$(B^2)' - B^2 + \frac{B^4}{4} = 0. \tag{4.238}$$

Taking $F = B^2$, we get

$$F' - F + \frac{F^2}{4} = 0. \tag{4.239}$$

This is a first order ODE in $F$, which can be easily solved. Separating variables, integrating, and transforming from $F$ back to $B$, we get

$$\frac{B^2}{1 - \frac{B^2}{4}} = C e^{\tilde{\tau}}. \tag{4.240}$$

Since $B(0) = 1$, we get $C = 4/3$. From this

$$B = \frac{2}{\sqrt{1 + 3e^{-\tilde{\tau}}}}, \tag{4.241}$$

so that

$$x(\tau, \tilde{\tau}) = \frac{2}{\sqrt{1 + 3e^{-\tilde{\tau}}}}\sin\tau + O(\epsilon), \tag{4.242}$$

$$x(t) = \underbrace{\frac{2}{\sqrt{1 + 3e^{-\epsilon t}}}}_{\text{Amplitude Modulation (AM)}}\sin\left((1 + O(\epsilon^2))\,t\right) + O(\epsilon). \tag{4.243}$$

The high precision numerical approximation for the solution trajectory in the $(x, \dot{x})$ phase plane, the high precision numerical solution $x_{exact}(t)$, the difference between the exact solution and the asymptotic leading order solution, $x_{exact}(t) - x_0(t)$, and the difference between the exact solution and the asymptotic solution corrected to $O(\epsilon)$, $x_{exact}(t) - (x_0(t) + \epsilon x_1(t))$, are plotted in Fig. 4.13. Note that the amplitude, which is initially 1, grows to a value of 2, the same value which was obtained in the previous example. This is evident in the phase plane, where the initial condition does not lie on the long time limit cycle. Here, we have additionally obtained the time scale for the growth of the amplitude change. Note also that the leading order approximation is poor for $t > 1/\epsilon$, while the corrected approximation is relatively good. Also note that for $\epsilon = 0.3$, the segregation in time scales is not dramatic. The "fast" time scale is that of the oscillation and is $O(1)$. The slow time scale is $O(1/\epsilon)$, which here is around 3. For smaller $\epsilon$, the effect would be more dramatic.
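The growth of the amplitude from 1 to 2 on the slow time scale can be checked numerically. The sketch below is illustrative only: it assumes a simple fourth order Runge-Kutta integrator in place of the high precision numerics, approximates the fast time by $\tau \approx t$, and uses hypothetical helper names (`envelope`, `x_leading`, `max_error`). It compares the leading order multiple scales solution, with envelope $2/\sqrt{1 + 3e^{-\epsilon t}}$, against the integrated solution with $x(0) = 0$, $\dot{x}(0) = 1$.

```python
import math

def envelope(t, eps):
    """Slowly varying amplitude 2/sqrt(1 + 3 exp(-eps*t)) of the leading order solution."""
    return 2.0 / math.sqrt(1.0 + 3.0 * math.exp(-eps * t))

def x_leading(t, eps):
    """Leading order multiple scales solution, with the fast time taken as t."""
    return envelope(t, eps) * math.sin(t)

def vdp_rhs(x, v, eps):
    """Van der Pol as a first order system: x' = v, v' = eps*(1 - x^2)*v - x."""
    return v, eps * (1.0 - x * x) * v - x

def rk4_step(x, v, dt, eps):
    """Advance (x, v) by one classical fourth order Runge-Kutta step."""
    k1x, k1v = vdp_rhs(x, v, eps)
    k2x, k2v = vdp_rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, eps)
    k3x, k3v = vdp_rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, eps)
    k4x, k4v = vdp_rhs(x + dt * k3x, v + dt * k3v, eps)
    x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x, v

def max_error(eps, t_end, dt=1.0e-3):
    """Largest |x_numerical - x_leading| on 0 < t <= t_end for x(0) = 0, dx/dt(0) = 1."""
    x, v = 0.0, 1.0
    err = 0.0
    for i in range(1, int(round(t_end / dt)) + 1):
        x, v = rk4_step(x, v, dt, eps)
        err = max(err, abs(x - x_leading(i * dt, eps)))
    return err

if __name__ == "__main__":
    eps = 0.3
    for t in (0.0, 1.0 / eps, 10.0 / eps):
        print(f"t = {t:6.2f}: envelope = {envelope(t, eps):.4f}")
    print("max error of leading order solution on [0, 30]:", max_error(eps, 30.0))
```

The envelope values printed at $t = 0$, $1/\epsilon$, and $10/\epsilon$ show the slow approach to the limit cycle amplitude of 2.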




4.2.5 Boundary layers 

The method of boundary layers, also known as matched asymptotic expansion, can be used 
in some cases. It is most appropriate for cases in which a small parameter multiplies the 
highest order derivative. In such cases a regular perturbation scheme will fail since we lose 
a boundary condition at leading order. 
Figure 4.13: Results for van der Pol equation, $d^2x/dt^2 - \epsilon(1 - x^2)\,dx/dt + x = 0$, $x(0) = 0$, $\dot{x}(0) = 1$, $\epsilon = 0.3$: a) high precision numerical phase plane, b) high precision numerical calculation of $x(t)$, along with the envelope $2/\sqrt{1 + 3e^{-\epsilon t}}$, c) difference between exact and asymptotic leading order solution (blue), and difference between exact and corrected asymptotic solution to $O(\epsilon)$ (red) from the method of multiple scales.



Example 4.14

Solve

$$\epsilon y'' + y' + y = 0, \qquad y(0) = 0, \quad y(1) = 1. \tag{4.244}$$

An exact solution to this equation exists, namely

$$y(x) = \exp\left(\frac{1 - x}{2\epsilon}\right)\frac{\sinh\left(\frac{x\sqrt{1 - 4\epsilon}}{2\epsilon}\right)}{\sinh\left(\frac{\sqrt{1 - 4\epsilon}}{2\epsilon}\right)}. \tag{4.245}$$

We could in principle simply expand this in a Taylor series in $\epsilon$. However, for more difficult problems, exact solutions are not available. So here we will just use the exact solution to verify the validity of the method.

We begin with a regular perturbation expansion

$$y(x) = y_0(x) + \epsilon y_1(x) + \epsilon^2 y_2(x) + \cdots. \tag{4.246}$$

Substituting and collecting terms, we get

$$O(\epsilon^0): \quad y_0' + y_0 = 0, \qquad y_0(0) = 0, \quad y_0(1) = 1, \tag{4.247}$$

the solution to which is

$$y_0 = a e^{-x}. \tag{4.248}$$

It is not possible for the solution to satisfy the two boundary conditions simultaneously since we only have one free variable, $a$. So, we divide the region of interest $x \in [0, 1]$ into two parts, a thin inner region or boundary layer around $x = 0$, and an outer region elsewhere.

Equation (4.248) gives the solution in the outer region. To satisfy the boundary condition $y_0(1) = 1$, we find that $a = e$, so that

$$y = e^{1 - x} + \cdots. \tag{4.249}$$
In the inner region, we choose a new independent variable $X$ defined as $X = x/\epsilon$, so that the equation becomes

$$\frac{d^2y}{dX^2} + \frac{dy}{dX} + \epsilon y = 0. \tag{4.250}$$

Using a perturbation expansion, the lowest order equation is

$$\frac{d^2y_0}{dX^2} + \frac{dy_0}{dX} = 0, \tag{4.251}$$

with a solution

$$y_0 = A + Be^{-X}. \tag{4.252}$$

Applying the boundary condition $y_0(0) = 0$, we get

$$y_0 = A(1 - e^{-X}). \tag{4.253}$$

Matching of the inner and outer solutions is achieved by (Prandtl's method)

$$y_{inner}(X \to \infty) = y_{outer}(x \to 0), \tag{4.254}$$

which gives $A = e$. The solution is

$$y(x) = e\left(1 - e^{-x/\epsilon}\right) + \cdots, \quad \text{in the inner region}, \tag{4.255}$$

$$\lim_{x \to \infty} y = e, \tag{4.256}$$

and

$$y(x) = e^{1 - x} + \cdots, \quad \text{in the outer region}, \tag{4.257}$$

$$\lim_{x \to 0} y = e. \tag{4.258}$$

A composite solution can also be written by adding the two solutions. However, one must realize that this induces a double counting in the region where the inner layer solution matches onto the outer layer solution. Thus, we need to subtract one term to account for this overlap. This is known as the common part. Thus, the correct composite solution is the summation of the inner and outer parts, with the common part subtracted:

$$y(x) = \left(e\left(1 - e^{-x/\epsilon}\right) + \cdots\right) + \left(e^{1 - x} + \cdots\right) - \underbrace{e}_{\text{common part}}, \tag{4.259}$$

$$y = e\left(e^{-x} - e^{-x/\epsilon}\right) + \cdots. \tag{4.260}$$

The exact solution, the inner layer solution, the outer layer solution, and the composite solution are plotted in Fig. 4.14.
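The leading order composite solution can be compared with the exact solution directly. The following is a minimal sketch under the assumption $\epsilon < 1/4$ (so the square root in the exact solution is real); the function names are hypothetical and chosen only for this illustration. It evaluates the exact solution, the inner, outer, and composite approximations for $\epsilon y'' + y' + y = 0$, $y(0) = 0$, $y(1) = 1$, and reports the largest composite error on a uniform grid.

```python
import math

def y_exact(x, eps):
    """Exact solution of eps*y'' + y' + y = 0, y(0) = 0, y(1) = 1 (assumes eps < 1/4)."""
    s = math.sqrt(1.0 - 4.0 * eps)
    return (math.exp((1.0 - x) / (2.0 * eps))
            * math.sinh(s * x / (2.0 * eps)) / math.sinh(s / (2.0 * eps)))

def y_outer(x):
    """Leading order outer solution e^(1 - x)."""
    return math.exp(1.0 - x)

def y_inner(x, eps):
    """Leading order inner solution e*(1 - exp(-x/eps))."""
    return math.e * (1.0 - math.exp(-x / eps))

def y_composite(x, eps):
    """Inner plus outer minus the common part e."""
    return y_inner(x, eps) + y_outer(x) - math.e

def max_error(eps, n=1000):
    """Largest |exact - composite| on a uniform grid of n + 1 points in [0, 1]."""
    return max(abs(y_exact(i / n, eps) - y_composite(i / n, eps)) for i in range(n + 1))

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.01):
        print(f"eps = {eps}: max composite error = {max_error(eps):.4f}")
```

For much smaller $\epsilon$ than shown here, the exponentials in `y_exact` would overflow in double precision, so a rescaled form of the exact solution would be needed.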

Example 4.15

Obtain the solution of the previous problem

$$\epsilon y'' + y' + y = 0, \qquad y(0) = 0, \quad y(1) = 1, \tag{4.261}$$

to the next order.

Ludwig Prandtl, 1875-1953, German engineer based in Göttingen.

Figure 4.14: Exact, inner layer solution, outer layer solution, and composite solution for boundary layer problem. (Figure annotations: $\epsilon y'' + y' + y = 0$, $y(0) = 0$, $y(1) = 1$, $\epsilon = 0.1$; Prandtl's boundary layer method.)

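Keeping terms of the next order in $\epsilon$, we have

$$y = e^{1 - x} + \epsilon\left((1 - x)e^{1 - x}\right) + \cdots, \tag{4.262}$$

for the outer solution, and

$$y = A\left(1 - e^{-X}\right) + \epsilon\left(B - AX - (B + AX)e^{-X}\right) + \cdots, \tag{4.263}$$

for the inner solution.

Higher order matching (Van Dyke's method) is obtained by expanding the outer solution in terms of the inner variable, the inner solution in terms of the outer variable, and comparing. Thus, the outer solution is, as $\epsilon \to 0$,

$$y = e^{1 - \epsilon X} + \epsilon\left((1 - \epsilon X)e^{1 - \epsilon X}\right) + \cdots, \tag{4.264}$$

$$= e(1 - \epsilon X) + \epsilon e(1 - \epsilon X)^2 + \cdots. \tag{4.265}$$

Ignoring terms of $O(\epsilon^2)$ and higher, we get

$$y = e(1 - \epsilon X) + \epsilon e, \tag{4.266}$$

$$= e + \epsilon e(1 - X), \tag{4.267}$$

$$= e + \epsilon e\left(1 - \frac{x}{\epsilon}\right), \tag{4.268}$$

$$= e + \epsilon e - ex. \tag{4.269}$$

Similarly, the inner solution as $\epsilon \to 0$ is

$$y = A\left(1 - e^{-x/\epsilon}\right) + \epsilon\left(B - A\frac{x}{\epsilon} - \left(B + A\frac{x}{\epsilon}\right)e^{-x/\epsilon}\right) + \cdots, \tag{4.270}$$

$$= A + B\epsilon - Ax. \tag{4.271}$$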



Milton Denman Van Dyke, 1922-2010, American engineer and applied mathematician.

Figure 4.15: Difference between exact and asymptotic solutions for two different orders of approximation for a boundary layer problem. (Figure annotations: $\epsilon y'' + y' + y = 0$, $y(0) = 0$, $y(1) = 1$, $\epsilon = 0.1$; Prandtl's boundary layer method.)




Comparing, we get $A = B = e$, so that

$$y(x) = e\left(1 - e^{-x/\epsilon}\right) + e\left(\epsilon - x - (\epsilon + x)e^{-x/\epsilon}\right) + \cdots \quad \text{in the inner region}, \tag{4.272}$$

and

$$y(x) = e^{1 - x} + \epsilon(1 - x)e^{1 - x} + \cdots \quad \text{in the outer region}. \tag{4.273}$$

The composite solution, inner plus outer minus common part, reduces to

$$y = e^{1 - x} - (1 + x)e^{1 - x/\epsilon} + \epsilon\left((1 - x)e^{1 - x} - e^{1 - x/\epsilon}\right) + \cdots. \tag{4.274}$$

The difference between the exact solution and the approximation from the previous example, and the difference between the exact solution and approximation from this example are plotted in Fig. 4.15.
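The improvement from the higher order matching can be checked with a few lines of code. This is a sketch only, assuming $\epsilon < 1/4$ and using hypothetical function names: it evaluates the exact solution of $\epsilon y'' + y' + y = 0$, $y(0) = 0$, $y(1) = 1$ alongside the leading order composite $e^{1-x} - e^{1 - x/\epsilon}$ and the two-term composite solution, and compares their maximum errors on $[0, 1]$.

```python
import math

def y_exact(x, eps):
    """Exact solution of eps*y'' + y' + y = 0, y(0) = 0, y(1) = 1 (assumes eps < 1/4)."""
    s = math.sqrt(1.0 - 4.0 * eps)
    return (math.exp((1.0 - x) / (2.0 * eps))
            * math.sinh(s * x / (2.0 * eps)) / math.sinh(s / (2.0 * eps)))

def y_composite_1(x, eps):
    """Leading order composite solution e^(1 - x) - e^(1 - x/eps)."""
    return math.exp(1.0 - x) - math.exp(1.0 - x / eps)

def y_composite_2(x, eps):
    """Two-term composite solution obtained after higher order matching."""
    return (math.exp(1.0 - x) - (1.0 + x) * math.exp(1.0 - x / eps)
            + eps * ((1.0 - x) * math.exp(1.0 - x) - math.exp(1.0 - x / eps)))

def max_error(approx, eps, n=1000):
    """Largest |exact - approx| on a uniform grid of n + 1 points in [0, 1]."""
    return max(abs(y_exact(i / n, eps) - approx(i / n, eps)) for i in range(n + 1))

if __name__ == "__main__":
    eps = 0.1
    print("leading order composite error:", max_error(y_composite_1, eps))
    print("two-term composite error:     ", max_error(y_composite_2, eps))
```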

Example 4.16

In the same problem, investigate the possibility of having the boundary layer at $x = 1$. The outer solution now satisfies the condition $y(0) = 0$, giving $y = 0$. Let

$$X = \frac{x - 1}{\epsilon}. \tag{4.275}$$

The lowest order inner solution satisfying $y(X = 0) = 1$ is

$$y = A + (1 - A)e^{-X}. \tag{4.276}$$

However, as $X \to -\infty$, this becomes unbounded and cannot be matched with the outer solution. Thus, a boundary layer at $x = 1$ is not possible.
Figure 4.16: Exact, approximate, and difference in predictions for a boundary layer problem.



Example 4.17

Solve

$$\epsilon y'' - y' + y = 0, \quad \text{with } y(0) = 0, \quad y(1) = 1. \tag{4.277}$$

The boundary layer is at $x = 1$. The outer solution is $y = 0$. Taking

$$X = \frac{x - 1}{\epsilon}, \tag{4.278}$$

the inner solution is

$$y = A + (1 - A)e^{X} + \cdots. \tag{4.279}$$

Matching, we get

$$A = 0, \tag{4.280}$$

so that we have a composite solution

$$y(x) = e^{(x - 1)/\epsilon} + \cdots. \tag{4.281}$$

The exact solution, the approximate solution to $O(\epsilon)$, and the difference between the exact solution and the approximation, are plotted in Fig. 4.16.



4.2.6 WKBJ method 

Any equation of the form

$$\frac{d^2v}{dx^2} + P(x)\frac{dv}{dx} + Q(x)v = 0, \tag{4.282}$$

can be written as

$$\frac{d^2y}{dx^2} + R(x)y = 0, \tag{4.283}$$

where

$$v(x) = y(x)\exp\left(-\frac{1}{2}\int_0^x P(s)\,ds\right), \tag{4.284}$$

$$R(x) = Q(x) - \frac{1}{2}\frac{dP}{dx} - \frac{1}{4}\left(P(x)\right)^2. \tag{4.285}$$

So it is sufficient to study equations of the form of Eq. (4.283). The Wentzel, Kramers, Brillouin, Jeffreys (WKBJ) method is used for equations of the kind

$$\epsilon^2\frac{d^2y}{dx^2} = f(x)y, \tag{4.286}$$

where $\epsilon$ is a small parameter. This also includes an equation of the type

$$\epsilon^2\frac{d^2y}{dx^2} = \left(\lambda^2 p(x) + q(x)\right)y, \tag{4.287}$$

where $\lambda$ is a large parameter. Alternatively, by taking $x = \epsilon t$, Eq. (4.286) becomes

$$\frac{d^2y}{dt^2} = f(\epsilon t)y. \tag{4.288}$$

We can also write Eq. (4.286) as

$$\frac{d^2y}{dx^2} = g(x)y, \tag{4.289}$$

where $g(x)$ is slowly varying in the sense that $g'/g^{3/2} \sim O(\epsilon)$.

We seek solutions to Eq. (4.286) of the form

$$y(x) = \exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(S_0(s) + \epsilon S_1(s) + \epsilon^2 S_2(s) + \cdots\right)ds\right). \tag{4.290}$$

The derivatives are

$$\frac{dy}{dx} = \frac{1}{\epsilon}\left(S_0(x) + \epsilon S_1(x) + \epsilon^2 S_2(x) + \cdots\right)y(x), \tag{4.291}$$

$$\frac{d^2y}{dx^2} = \frac{1}{\epsilon^2}\left(S_0(x) + \epsilon S_1(x) + \epsilon^2 S_2(x) + \cdots\right)^2 y(x) + \frac{1}{\epsilon}\left(\frac{dS_0}{dx} + \epsilon\frac{dS_1}{dx} + \epsilon^2\frac{dS_2}{dx} + \cdots\right)y(x). \tag{4.292}$$

Gregor Wentzel, 1898-1978, German physicist.

Hendrik Anthony Kramers, 1894-1952, Dutch physicist.

Léon Brillouin, 1889-1969, French physicist.

Harold Jeffreys, 1891-1989, English mathematician.
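Substituting into Eq. (4.286), we get

$$\underbrace{\left(\left(S_0(x)\right)^2 + 2\epsilon S_0(x)S_1(x) + \cdots\right)y(x) + \epsilon\left(\frac{dS_0}{dx} + \cdots\right)y(x)}_{=\,\epsilon^2\,d^2y/dx^2} = f(x)y(x). \tag{4.293}$$

Collecting terms, at $O(\epsilon^0)$ we have

$$S_0^2(x) = f(x), \tag{4.294}$$

from which

$$S_0(x) = \pm\sqrt{f(x)}. \tag{4.295}$$

To $O(\epsilon^1)$ we have

$$2S_0(x)S_1(x) + \frac{dS_0}{dx} = 0, \tag{4.296}$$

from which

$$S_1(x) = -\frac{\frac{dS_0}{dx}}{2S_0(x)}, \tag{4.297}$$

$$= -\frac{\pm\frac{1}{2\sqrt{f(x)}}\frac{df}{dx}}{2\left(\pm\sqrt{f(x)}\right)}, \tag{4.298}$$

$$= -\frac{\frac{df}{dx}}{4f(x)}. \tag{4.299}$$

Thus, we get the general solution, with one term for each sign of $S_0$,

$$y(x) = C_1\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(S_0(s) + \epsilon S_1(s) + \cdots\right)ds\right) + C_2\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(S_0(s) + \epsilon S_1(s) + \cdots\right)ds\right), \tag{4.300}$$

$$y(x) = C_1\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(\sqrt{f(s)} - \epsilon\frac{\frac{df}{ds}}{4f(s)} + \cdots\right)ds\right) + C_2\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(-\sqrt{f(s)} - \epsilon\frac{\frac{df}{ds}}{4f(s)} + \cdots\right)ds\right), \tag{4.301}$$

$$y(x) = C_1\exp\left(-\int_{f(x_0)}^{f(x)}\frac{df}{4f}\right)\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\left(\sqrt{f(s)} + \cdots\right)ds\right) + C_2\exp\left(-\int_{f(x_0)}^{f(x)}\frac{df}{4f}\right)\exp\left(-\frac{1}{\epsilon}\int_{x_0}^{x}\left(\sqrt{f(s)} + \cdots\right)ds\right), \tag{4.302}$$

$$y(x) = \frac{\hat{C}_1}{\left(f(x)\right)^{1/4}}\exp\left(\frac{1}{\epsilon}\int_{x_0}^{x}\sqrt{f(s)}\,ds\right) + \frac{\hat{C}_2}{\left(f(x)\right)^{1/4}}\exp\left(-\frac{1}{\epsilon}\int_{x_0}^{x}\sqrt{f(s)}\,ds\right) + \cdots. \tag{4.303}$$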
CC BY-NC-ND. 29 July 2012, Sen & Powers.

This solution is not valid near x = a for which f(a) = 0. These are called turning points. 
At such points the solution changes from an oscillatory to an exponential character. 



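Example 4.18

Find an approximate solution of the Airy equation

$$\epsilon^2 y'' + xy = 0, \quad \text{for } x > 0. \tag{4.304}$$

In this case

$$f(x) = -x. \tag{4.305}$$

Thus, $x = 0$ is a turning point. We find that

$$S_0(x) = \pm i\sqrt{x}, \tag{4.306}$$

and

$$S_1(x) = -\frac{S_0'}{2S_0} = -\frac{1}{4x}. \tag{4.307}$$

The solutions are of the form

$$y = \exp\left(\pm\frac{i}{\epsilon}\int\sqrt{x}\,dx - \int\frac{dx}{4x}\right) + \cdots, \tag{4.308}$$

$$= \frac{1}{x^{1/4}}\exp\left(\pm\frac{2x^{3/2}i}{3\epsilon}\right) + \cdots. \tag{4.309}$$

The general approximate solution is

$$y = \frac{C_1}{x^{1/4}}\sin\left(\frac{2x^{3/2}}{3\epsilon}\right) + \frac{C_2}{x^{1/4}}\cos\left(\frac{2x^{3/2}}{3\epsilon}\right) + \cdots. \tag{4.310}$$

The exact solution can be shown to be

$$y = C_1\,\mathrm{Ai}\left(-\epsilon^{-2/3}x\right) + C_2\,\mathrm{Bi}\left(-\epsilon^{-2/3}x\right). \tag{4.311}$$

Here Ai and Bi are Airy functions of the first and second kind, respectively. See Sec. 10.7.9 in the Appendix.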
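One way to gauge the accuracy of the WKBJ approximation away from the turning point, without evaluating Airy functions, is to integrate the differential equation numerically from initial data supplied by the approximation and watch how far the two drift apart. The sketch below does this for $\epsilon^2 y'' + xy = 0$ on $x \in [1, 2]$ with the choice $C_1 = 1$, $C_2 = 0$. It is an illustration under stated assumptions: the Runge-Kutta integrator, the interval, and the names `y_wkbj`, `dy_wkbj`, and `wkbj_max_error` are all hypothetical choices made here.

```python
import math

def y_wkbj(x, eps):
    """WKBJ approximation with C1 = 1, C2 = 0: x^(-1/4) * sin(2 x^(3/2) / (3 eps))."""
    return x ** -0.25 * math.sin(2.0 * x ** 1.5 / (3.0 * eps))

def dy_wkbj(x, eps):
    """Derivative of y_wkbj with respect to x, used only to start the integration."""
    phase = 2.0 * x ** 1.5 / (3.0 * eps)
    return -0.25 * x ** -1.25 * math.sin(phase) + x ** 0.25 / eps * math.cos(phase)

def rhs(x, y, p, eps):
    """First order form of eps^2 y'' + x y = 0: y' = p, p' = -x*y/eps^2."""
    return p, -x * y / (eps * eps)

def wkbj_max_error(eps, x_start=1.0, x_end=2.0, n=20000):
    """Integrate with RK4 from WKBJ initial data; return max |y_numerical - y_wkbj|."""
    h = (x_end - x_start) / n
    x, y, p = x_start, y_wkbj(x_start, eps), dy_wkbj(x_start, eps)
    err = 0.0
    for i in range(n):
        k1y, k1p = rhs(x, y, p, eps)
        k2y, k2p = rhs(x + 0.5 * h, y + 0.5 * h * k1y, p + 0.5 * h * k1p, eps)
        k3y, k3p = rhs(x + 0.5 * h, y + 0.5 * h * k2y, p + 0.5 * h * k2p, eps)
        k4y, k4p = rhs(x + h, y + h * k3y, p + h * k3p, eps)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        x = x_start + (i + 1) * h
        err = max(err, abs(y - y_wkbj(x, eps)))
    return err

if __name__ == "__main__":
    for eps in (0.1, 0.05):
        print(f"eps = {eps}: max deviation on [1, 2] = {wkbj_max_error(eps):.5f}")
```

Starting the integration closer to the turning point $x = 0$ would show the approximation degrading, since the factor $x^{-1/4}$ becomes singular there.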

George Biddell Airy, 1801-1892, English applied mathematician, First Wrangler at Cambridge, holder of the Lucasian Chair (that held by Newton) at Cambridge, Astronomer Royal who had some role in delaying the identification of Neptune as predicted by John Couch Adams' perturbation theory in 1845.

Example 4.19

Find a solution of $x^3 y'' = y$, for small, positive $x$.

Let $\epsilon^2 X = x$, so that $X$ is of $O(1)$ when $x$ is small. Then the equation becomes

$$\epsilon^2\frac{d^2y}{dX^2} = X^{-3}y. \tag{4.312}$$

The WKBJ method is applicable. We have $f = X^{-3}$. The general solution is

$$y = C_1' X^{3/4}\exp\left(-\frac{2}{\epsilon\sqrt{X}}\right) + C_2' X^{3/4}\exp\left(\frac{2}{\epsilon\sqrt{X}}\right) + \cdots. \tag{4.313}$$

In terms of the original variables

$$y = C_1 x^{3/4}\exp\left(-\frac{2}{\sqrt{x}}\right) + C_2 x^{3/4}\exp\left(\frac{2}{\sqrt{x}}\right) + \cdots. \tag{4.314}$$

The exact solution can be shown to be

$$y = \sqrt{x}\left(C_1 I_1\left(\frac{2}{\sqrt{x}}\right) + C_2 K_1\left(\frac{2}{\sqrt{x}}\right)\right). \tag{4.315}$$

Here $I_1$ is a modified Bessel function of the first kind of order one, and $K_1$ is a modified Bessel function of the second kind of order one.



4.2.7 Solutions of the type $e^{S(x)}$

Example 4.20

Solve

$$x^3 y'' = y, \tag{4.316}$$

for small, positive $x$.

Let $y = e^{S(x)}$, so that $y' = S'e^{S}$, $y'' = (S')^2 e^{S} + S''e^{S}$, from which

$$S'' + (S')^2 = x^{-3}. \tag{4.317}$$

Assume that $S'' \ll (S')^2$ (to be checked later). Thus, $S' = \pm x^{-3/2}$, and $S = \pm 2x^{-1/2}$. Checking, we get $S''/(S')^2 \sim x^{1/2} \to 0$ as $x \to 0$, confirming the assumption. Now we add a correction term so that $S(x) = 2x^{-1/2} + C(x)$, where we have taken the positive sign. Assume that $C \ll 2x^{-1/2}$. Substituting in the equation, we have

$$\frac{3}{2}x^{-5/2} + C'' - 2x^{-3/2}C' + (C')^2 = 0. \tag{4.318}$$

Since $C \ll 2x^{-1/2}$, we have $C' \ll x^{-3/2}$ and $C'' \ll (3/2)x^{-5/2}$. Thus

$$\frac{3}{2}x^{-5/2} - 2x^{-3/2}C' = 0, \tag{4.319}$$

from which $C' = (3/4)x^{-1}$ and $C = (3/4)\ln x$. We can now check the assumption on $C$.

We have $S(x) = 2x^{-1/2} + (3/4)\ln x$, so that

$$y = x^{3/4}\exp\left(\frac{2}{\sqrt{x}}\right) + \cdots. \tag{4.320}$$

Another solution is obtained by taking $S(x) = -2x^{-1/2} + C(x)$. This procedure is similar to that of the WKBJ method, and the solution is identical. The exact solution is of course the same as the previous example.
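The quality of the approximation $S(x) = 2x^{-1/2} + (3/4)\ln x$ can be checked by substituting it back into $S'' + (S')^2 = x^{-3}$. Working through the algebra by hand gives $x^3\left(S'' + (S')^2\right) = 1 - \frac{3}{16}x$, which tends to 1 as $x \to 0$. The short sketch below, with hypothetical function names, evaluates the same quantity from the analytic derivatives of $S$; it is a sanity check, not part of the original derivation.

```python
def dS(x):
    """S'(x) for S = 2 x^(-1/2) + (3/4) ln x."""
    return -x ** -1.5 + 0.75 / x

def d2S(x):
    """S''(x) for S = 2 x^(-1/2) + (3/4) ln x."""
    return 1.5 * x ** -2.5 - 0.75 / x ** 2

def ode_ratio(x):
    """x^3 * y''/y for y = exp(S); equals 1 wherever x^3 y'' = y holds exactly."""
    return x ** 3 * (d2S(x) + dS(x) ** 2)

if __name__ == "__main__":
    for x in (0.1, 0.01, 0.001):
        print(f"x = {x}: x^3 y''/y = {ode_ratio(x):.6f}")
```

The printed ratios approach 1 linearly in $x$, consistent with the neglected terms being small only for small, positive $x$.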

Figure 4.17: Numerical and first approximate solution for repeated substitution problem. (Figure annotations: $y' = \exp(-xy)$, numerical solution with $y(\infty) = 1$; first approximation $y = 1 - \exp(-x)$; repeated substitution method.)

4.2.8 Repeated substitution 

This technique sometimes works if the range of the independent variable is such that some 
term is small. 



Example 4.21

Solve

$$y' = e^{-xy}, \qquad y(\infty) \to c, \quad c > 0, \tag{4.321}$$

for $y > 0$ and large $x$.

As $x \to \infty$, $y' \to 0$, so that $y \to c$. Substituting $y = c$ into Eq. (4.321), we get

$$y' = e^{-cx}, \tag{4.322}$$

which can be integrated to get, after application of the boundary condition,

$$y = c - \frac{1}{c}e^{-cx}. \tag{4.323}$$

Substituting Eq. (4.323) into the original Eq. (4.321), we find

$$y' = \exp\left(-x\left(c - \frac{1}{c}e^{-cx}\right)\right), \tag{4.324}$$

$$= e^{-cx}\left(1 + \frac{x}{c}e^{-cx} + \cdots\right), \tag{4.325}$$

which can be integrated to give

$$y = c - \frac{1}{c}e^{-cx} - \frac{1}{2c^2}\left(x + \frac{1}{2c}\right)e^{-2cx} + \cdots. \tag{4.326}$$

The series converges for large $x$. An accurate numerical solution along with the first approximation are plotted in Fig. 4.17.
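A simple way to see that each substitution improves the approximation is to evaluate the residual $y' - e^{-xy}$ for the successive approximations at a large value of $x$. The sketch below does so for the first approximation $c - \frac{1}{c}e^{-cx}$ and the second approximation that adds the $e^{-2cx}$ term. The function names and the choice $c = 1$, $x = 5$ are illustrative assumptions, not taken from the text.

```python
import math

def y1(x, c):
    """First approximation: c - exp(-c x)/c."""
    return c - math.exp(-c * x) / c

def dy1(x, c):
    """Derivative of the first approximation."""
    return math.exp(-c * x)

def y2(x, c):
    """Second approximation: adds -(x + 1/(2c)) exp(-2 c x) / (2 c^2)."""
    return y1(x, c) - (x + 0.5 / c) * math.exp(-2.0 * c * x) / (2.0 * c * c)

def dy2(x, c):
    """Derivative of the second approximation."""
    return math.exp(-c * x) + (x / c) * math.exp(-2.0 * c * x)

def residual(y, dy, x, c):
    """Residual y' - exp(-x y) of the differential equation for an approximation y."""
    return dy(x, c) - math.exp(-x * y(x, c))

if __name__ == "__main__":
    c, x = 1.0, 5.0
    print("residual of first approximation: ", residual(y1, dy1, x, c))
    print("residual of second approximation:", residual(y2, dy2, x, c))
```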

Problems

1. Solve as a series in $x$ for $x > 0$ about the point $x = 0$:

(a) $x^2 y'' - 2xy' + (x + 1)y = 0$; $y(1) = 1$, $y(4) = 0$.

(b) $xy'' + y' + 2x^2 y = 0$; $|y(0)| < \infty$, $y(1) = 1$.

In each case find the exact solution with a symbolic computation program, and compare graphically the first four terms of your series solution with the exact solution.

2. Find two-term expansions for each of the roots of

$$(x - 1)(x + 3)(x - 3\lambda) + 1 = 0,$$

where $\lambda$ is large.

3. Find two terms of an approximate solution of

$$y'' + \frac{\lambda}{\lambda + x}y = 0,$$

with $y(0) = 0$, $y(1) = 1$, where $\lambda$ is a large parameter. For $\lambda = 20$, plot $y(x)$ for the two-term expansion. Also compute the exact solution by numerical integration. Plot the difference between the asymptotic and numerical solution versus $x$.



4. Find the leading order solution for 



< \ d y , 

\x-ey)—- + xy = e 
ax 



where $y(1) = 1$, and $x \in [0, 1]$, $\epsilon \ll 1$. For $\epsilon = 0.2$, plot the asymptotic solution, the exact solution and the difference versus $x$.

5. The motion of a pendulum is governed by the equation

$$\frac{d^2x}{dt^2} + \sin(x) = 0,$$

with $x(0) = \epsilon$, $\frac{dx}{dt}(0) = 0$. Using strained coordinates, find the approximate solution of $x(t)$ for small $\epsilon$ through $O(\epsilon^2)$. Plot your results for both your asymptotic results and those obtained by a numerical integration of the full equation.

6. Find an approximate solution for

$$y'' - y e^{y/10} = 0,$$

with $y(0) = 1$, $y(1) = e$.

7. Find an approximate solution for the following problem: 

y _ ye v/v = o, with 2/(0) = 0.1, 2/(0) = 1.2. 

Compare with the numerical solution for $0 < x < 1$.

8. Find the lowest order solution for

$$\epsilon^2 y'' + \epsilon y^2 - y + 1 = 0,$$

with $y(0) = 1$, $y(1) = 3$, where $\epsilon$ is small. For $\epsilon = 0.2$, plot the asymptotic and exact solutions.

9. Show that for small $\epsilon$ the solution of

d V t 

with $y(0) = 1$ can be approximated as an exponential on a slightly different time scale.

10. Obtain approximate general solutions of the following equations near $x = 0$.

(a) $xy'' + y' + xy = 0$, through $O(x^6)$,

(b) $xy'' + y = 0$, through $O(x^2)$.

11. Find all solutions through $O(\epsilon^2)$, where $\epsilon$ is a small parameter, and compare with the exact result for $\epsilon = 0.01$.

(a) $4x^4 + 4(\epsilon + 1)x^3 + 3(2\epsilon - 5)x^2 + (2\epsilon - 16)x - 4 = 0$,

(b) $2\epsilon x^4 + 2(2\epsilon + 1)x^3 + (7 - 2\epsilon)x^2 - 5x - 4 = 0$.

12. Find three terms of a solution of 

IT 

x + e cos(:r + 2e) = — , 

where e is a small parameter. For e = 0.2, compare the best asymptotic solution with the exact 
solution. 

13. Find three terms of the solution of 

x + 2x + ex =0, with x(0) = coshe, 

where $\epsilon$ is a small parameter. Compare graphically with the exact solution for $\epsilon = 0.3$ and $0 < t < 2$.

14. Write down an approximation for

$$\int_0^{\pi/2}\sqrt{1 + \epsilon\cos^2 x}\,dx,$$

if $\epsilon = 0.1$, so that the absolute error is less than $2 \times 10^{-4}$.

15. Solve 

y » + y = e ^in* : With y(0) = y(l) = 0, 

through O(e), where e is a small parameter. For e = 0.25 graphically compare the asymptotic solution 
with a numerically obtained solution. 

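16. The solution of the matrix equation $\mathbf{A}\cdot\mathbf{x} = \mathbf{y}$ can be written as $\mathbf{x} = \mathbf{A}^{-1}\cdot\mathbf{y}$. Find the perturbation solution of $(\mathbf{A} + \epsilon\mathbf{B})\cdot\mathbf{x} = \mathbf{y}$, where $\epsilon$ is a small parameter.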

17. Find all solutions of ex + x — 2 = approximately, if e is small and positive. If e = 0.001, compare 
the exact solution obtained numerically with the asymptotic solution. 

18. Obtain the first two terms of an approximate solution to

$$\ddot{x} + 3(1 + \epsilon)\dot{x} + 2x = 0, \quad \text{with } x(0) = 2(1 + \epsilon), \quad \dot{x}(0) = -3(1 + 2\epsilon),$$

for small $\epsilon$. Compare the approximate and exact solutions graphically in the range $0 < x < 1$ for (a) $\epsilon = 0.1$, (b) $\epsilon = 0.25$, and (c) $\epsilon = 0.5$.

19. Find an approximate solution to

$$\ddot{x} + (1 + \epsilon)x = 0, \quad \text{with } x(0) = A, \quad \dot{x}(0) = B,$$

for small, positive $\epsilon$. Compare with the exact solution. Plot both the exact solution and the approximate solution on the same graph for $A = 1$, $B = 0$, $\epsilon = 0.3$.

20. Find an approximate solution to the following problem for small $\epsilon$

$$\epsilon^2 y'' - y = -1, \quad \text{with } y(0) = 0, \quad y(1) = 0.$$

Compare graphically with the exact solution for $\epsilon = 0.1$.

21. Solve to leading order

$$\epsilon y'' + yy' - y = 0, \quad \text{with } y(0) = 0, \quad y(1) = 3.$$

Compare graphically to the exact solution for $\epsilon = 0.2$.

22. If $\ddot{x} + x + \epsilon x^3 = 0$ with $x(0) = A$, $\dot{x}(0) = 0$ where $\epsilon$ is small, a regular expansion gives $x(t) \approx A\cos t + \epsilon(A^3/32)(-\cos t + \cos 3t - 12t\sin t)$. Explain why this is not valid for all time, and obtain a better solution by inserting $t = (1 + a_1\epsilon + \cdots)\tau$ into this solution, expanding in terms of $\epsilon$, and choosing $a_1, a_2, \ldots$ properly (Pritulo's method).

23. Use perturbations to find an approximate solution to

$$y'' + \lambda y' = \lambda, \quad \text{with } y(0) = 0, \quad y(1) = 0,$$

where $\lambda \gg 1$.

24. Find the complementary functions of 

y'" -xy = 0, 

in terms of expansions near x = 0. Retain only two terms for each function. 

25. Find, correct to $O(\epsilon)$, the solution of

$$\ddot{x} + (1 + \epsilon\cos 2t)\,x = 0, \quad \text{with } x(0) = 1, \text{ and } \dot{x}(0) = 0,$$

that is bounded for all $t$, where $\epsilon \ll 1$.

26. Find the function / to O(e) where it satisfies the integral equation 

fX+e sin a: 



I 



/(£) #• 

27. Find three terms of a perturbation solution of 

y" + ty 2 = 0, 

with y(0) = 0, y(l) = 1 for e -C 1. For e = 2.5, compare the 0(l),0(e), and 0(e 2 ) solutions to a 
numerically obtained solution in x G [0, 1]. 

28. Obtain a power series solution (in summation form) for $y' + ky = 0$ about $x = 0$, where $k$ is an arbitrary, nonzero constant. Compare to a Taylor series expansion of the exact solution.

29. Obtain two terms of an approximate solution for $\epsilon e^x = \cos x$ when $\epsilon$ is small. Graphically compare to the actual values (obtained numerically) when $\epsilon = 0.2, 0.1, 0.01$.

30. Obtain three terms of a perturbation solution for the roots of the equation $(1 - \epsilon)x^2 - 2x + 1 = 0$. (Hint: The expansion $x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots$ will not work.)

31. The solution of the matrix equation $\mathbf{A}\cdot\mathbf{x} = \mathbf{y}$ can be written as $\mathbf{x} = \mathbf{A}^{-1}\cdot\mathbf{y}$. Find the $n$th term of the perturbation solution of $(\mathbf{A} + \epsilon\mathbf{B})\cdot\mathbf{x} = \mathbf{y}$, where $\epsilon$ is a small parameter. Obtain the first three terms of the solution for

1/10 1/2 1/10' 

1/5 

1/2 1/10 1/2 
32. Obtain leading and first order terms for $u$ and $v$, governed by the following set of coupled differential equations, for small $\epsilon$:

d 2 u du , . ,,11 

_ + «,-=l l «(0) = 0,«(l) = - + — e, 

d 2 v dv , N ,,11 

^ + eU -=,,,(0) = 0, ,(!) = - + -, 

Compare asymptotic and numerically obtained results for e = 0.2. 

33. Obtain two terms of a perturbation solution to $\epsilon f_{xx} + f_x = -e^{-x}$ with boundary conditions $f(0) = 0$, $f(1) = 1$. Graph the solution for $\epsilon = 0.2, 0.1, 0.05, 0.025$ on $0 < x < 1$.

34. Find two uniformly valid approximate solutions of 

u> 2 u , . 

u H = = 0, with iiO = 0, 

1 + u 

up to the first order. Note that to is not small. 

35. Using a two-variable expansion, find the lowest order solution of 

(a) x + ex + x = with x(0) = 0, x{0) = 1, 

(b) x + ex 3 + x = with x(0) = 0, x(0) = 1. 

where e< 1. Compare asymptotic and numerically obtained results for e = 0.01. 

36. Obtain a three-term solution of 

ex-x = l, with x(0) = 0, x{l) = 2, 

where e< 1. 

37. Find an approximate solution to the following problem for small $\epsilon$

$$\epsilon^2 y'' - y = -1, \quad \text{with } y(0) = 0, \quad y(1) = 0.$$

Compare graphically with the exact solution for $\epsilon = 0.1$.

38. A projectile of mass m is launched at an angle a with respect to the horizontal, and with an initial 
velocity V . Find the time it takes to reach its maximum height. Assume that the air resistance is 
small and can be written as k times the square of the velocity of the projectile. Choosing appropriate 
values for the parameters, compare with the numerical result. 

39. For small $\epsilon$, solve using WKBJ

$$\epsilon^2 y'' = (1 + x^2)^2 y, \quad \text{with } y(0) = 0, \quad y(1) = 1.$$

40. Obtain a general series solution of

$$y'' + k^2 y = 0,$$

about $x = 0$.

41. Find a general solution of

y +e y = 1,

near $x = 0$.

42. Solve

x 2 y" + x ( - + 2xjy' + (x - - J y = 0, 

around x = 0. 

43. Solve $y'' - \sqrt{x}\,y = 0$, $x > 0$, in each one of the following ways:

(a) Substitute $x = \epsilon^{-4/5}X$, and then use WKBJ.

(b) Substitute $x = \epsilon^{2/5}X$, and then use regular perturbation.

(c) Find an approximate solution of the kind $y = e^{S(x)}$.

where $\epsilon$ is small.

44. Find a solution of 

y"' - yfcy = 0, 

for small x > 0. 

45. Find an approximate general solution of

$$(x\sin x)\,y'' + (2x\cos x + x\sin x)\,y' + (x\sin x + \sin x + x\cos x)\,y = 0,$$

valid near $x = 0$.

46. A bead can slide along a circular hoop in a vertical plane. The bead is initially at the lowest position, $\theta = 0$, and given an initial velocity of $2\sqrt{gR}$, where $g$ is the acceleration due to gravity and $R$ is the radius of the hoop. If the friction coefficient is $\mu$, find the maximum angle $\theta_{max}$ reached by the bead. Compare perturbation and numerical results. Present results on a $\theta_{max}$ vs. $\mu$ plot, for $0 < \mu < 0.3$.

47. The initial velocity downwards of a body of mass m immersed in a very viscous fluid is V. Find 
the velocity of the body as a function of time. Assume that the viscous force is proportional to the 
velocity. Assume that the inertia of the body is small, but not negligible, relative to viscous and 
gravity forces. Compare perturbation and exact solutions graphically. 

48. For small $\epsilon$, solve to lowest order using the method of multiple scales

$$\ddot{x} + \epsilon\dot{x} + x = 0, \quad \text{with } x(0) = 0, \quad \dot{x}(0) = 1.$$

Compare exact and asymptotic results for $\epsilon = 0.3$.

49. For small $\epsilon$, solve using WKBJ

$$\epsilon^2 y'' = (1 + x^2)^2 y, \quad \text{with } y(0) = 0, \quad y(1) = 1.$$

Plot asymptotic and numerical solutions for $\epsilon = 0.11$.

50. Find the lowest order approximate solution to

$$\epsilon^2 y'' + \epsilon y^2 - y + 1 = 0, \quad \text{with } y(0) = 1, \quad y(1) = 2,$$

where $\epsilon$ is small. Plot asymptotic and numerical solutions for $\epsilon = 0.23$.

51. A pendulum is used to measure the earth's gravity. The frequency of oscillation is measured, and the gravity calculated assuming a small amplitude of motion and knowing the length of the pendulum. What must the maximum initial angular displacement of the pendulum be if the error in gravity is to be less than 1%? Neglect air resistance.

52. Find two terms of an approximate solution of

$$y'' + \frac{\lambda}{\lambda + x}y = 0,$$

with $y(0) = 0$, $y(1) = 1$, where $\lambda$ is a large parameter.

53. Find all solutions of $e^{\epsilon x} = x^2$ through $O(\epsilon^2)$, where $\epsilon$ is a small parameter.

54. Solve

$$(1 + \epsilon)y'' + \epsilon y^2 = 1,$$

with $y(0) = 0$, $y(1) = 1$ through $O(\epsilon^2)$, where $\epsilon$ is a small parameter.

55. Solve to lowest order 

II i I 2 -i 

ey +y +ey =1, 

with $y(0) = -1$, $y(1) = 1$, where $\epsilon$ is a small parameter. For $\epsilon = 0.2$, plot asymptotic and numerical solutions to the full equation.

56. Find the series solution of the differential equation

$$y'' + xy = 0,$$

around $x = 0$ up to four terms.

57. Find the local solution of the equation

$$y'' = \sqrt{x}\,y,$$

near $x \to 0^+$.

58. Find the solution of the transcendental equation

$$\sin x = \epsilon\cos 2x,$$

near $x = \pi$ for small positive $\epsilon$.

59. Solve 

ii i -i 

with $y(0) = 0$, $y(1) = 2$ for small $\epsilon$. Plot asymptotic and numerical solutions for $\epsilon = 0.04$.

60. Find two terms of the perturbation solution of

$$(1 + \epsilon y)y'' + \epsilon y'^2 - N^2 y = 0,$$

with $y'(0) = 0$, $y(1) = 1$, for small $\epsilon$. $N$ is a constant. Plot the asymptotic and numerical solution for $\epsilon = 0.12$, $N = 10$.

61. Solve 

ey" + y' = \, 

with $y(0) = 0$, $y(1) = 1$ for small $\epsilon$. Plot asymptotic and numerical solutions for $\epsilon = 0.12$.

62. Find if the van der Pol equation

$$\ddot{y} - \epsilon(1 - y^2)\dot{y} + k^2 y = 0,$$

has a limit cycle of the form $y = A\cos\omega t$.

63. Solve $y' = e^{-2xy}$ for large $x$ where $y$ is positive. Plot $y(x)$.




Chapter 5

Orthogonal functions and Fourier series



see Kaplan, Chapter 7, 

see Lopez, Chapters 10, 16, 

see Riley, Hobson, and Bence, Chapter 15.4, 15.5. 

Solution of linear differential equations gives rise to complementary functions. Some of these are well known, such as sine and cosine. This chapter will consider these and other functions which arise from the solution of a variety of linear second order differential equations with constant and non-constant coefficients. The notion of eigenvalues, eigenfunctions, orthogonal, and orthonormal functions will be introduced; a stronger foundation will be built in Chapter 7 on linear analysis. A key result of the present chapter will be to show how one can expand an arbitrary function in terms of infinite sums of the product of scalar amplitudes with orthogonal basis functions. Such a summation is known as a Fourier series.

5.1 Sturm-Liouville equations 

Consider on the domain $x \in [x_0, x_1]$ the following general linear homogeneous second order differential equation with general homogeneous boundary conditions:

$$a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y + \lambda y = 0, \tag{5.1}$$

$$\alpha_1 y(x_0) + \alpha_2 y'(x_0) = 0, \tag{5.2}$$

$$\beta_1 y(x_1) + \beta_2 y'(x_1) = 0. \tag{5.3}$$

Define the following functions:

$$p(x) = \exp\left(\int_{x_0}^{x}\frac{b(s)}{a(s)}\,ds\right), \tag{5.4}$$

$$r(x) = \frac{1}{a(x)}\exp\left(\int_{x_0}^{x}\frac{b(s)}{a(s)}\,ds\right), \tag{5.5}$$

$$q(x) = \frac{c(x)}{a(x)}\exp\left(\int_{x_0}^{x}\frac{b(s)}{a(s)}\,ds\right). \tag{5.6}$$

With these definitions, Eq. (5.1) is transformed to the type known as a Sturm-Liouville equation:

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \left(q(x) + \lambda r(x)\right)y(x) = 0, \tag{5.7}$$

$$\frac{1}{r(x)}\left(\frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x)\right)y(x) = -\lambda y(x). \tag{5.8}$$

Here the Sturm-Liouville linear operator $L_s$ is

$$L_s = \frac{1}{r(x)}\left(\frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x)\right), \tag{5.9}$$

so we have Eq. (5.8) compactly stated as

$$L_s y(x) = -\lambda y(x). \tag{5.10}$$

Jean Baptiste Joseph Fourier, 1768-1830, French mathematician.

It can be shown that $L_s$ is what is known as a self-adjoint linear operator; see Sec. 7.4.2. What has been shown then is that all systems of the form of Eqs. (5.1-5.3) can be transformed into a self-adjoint form.

Now the trivial solution $y(x) = 0$ will satisfy the differential equation and boundary conditions, Eqs. (5.1-5.3). In addition, for special real values of $\lambda$, known as eigenvalues, there are special non-trivial functions, known as eigenfunctions, which also satisfy Eqs. (5.1-5.3). Eigenvalues and eigenfunctions will be discussed in more general terms in Sec. 7.4.4.

Now it can be shown that if we have for $x \in [x_0, x_1]$

$$p(x) > 0, \tag{5.11}$$

$$r(x) > 0, \tag{5.12}$$

$$q(x) > 0, \tag{5.13}$$

then an infinite number of real positive eigenvalues $\lambda$ and corresponding eigenfunctions $y_n(x)$ exist for which Eqs. (5.1-5.3) are satisfied. Moreover, it can also be shown (Hildebrand, p. 204) that a consequence of the homogeneous boundary conditions is the orthogonality condition:

$$\langle y_n, y_m\rangle = \int_{x_0}^{x_1} r(x)\,y_n(x)\,y_m(x)\,dx = 0, \quad \text{for } n \neq m, \tag{5.14}$$

$$\langle y_n, y_n\rangle = \int_{x_0}^{x_1} r(x)\,y_n(x)\,y_n(x)\,dx = K^2. \tag{5.15}$$

Jacques Charles François Sturm, 1803-1855, Swiss-born French mathematician and Joseph Liouville, 1809-1882, French mathematician.
Consequently, in the same way that in ordinary vector mechanics $\mathbf{i}\cdot\mathbf{j} = 0$, $\mathbf{i}\cdot\mathbf{k} = 0$, $\mathbf{i}\cdot\mathbf{i} = 1$
implies $\mathbf{i}$ is orthogonal to $\mathbf{j}$ and $\mathbf{k}$, the eigenfunctions of a Sturm-Liouville operator $L_s$ are
said to be orthogonal to each other. The so-called inner product notation, $\langle\cdot,\cdot\rangle$, will be
explained in detail in Sec. 7.3.2. Here $K \in \mathbb{R}^1$ is a real constant. This can be written
compactly using the Kronecker delta function, $\delta_{nm}$, as

$$\int_{x_0}^{x_1} r(x)\, y_n(x)\, y_m(x)\, dx = K^2 \delta_{nm}. \tag{5.16}$$

Sturm-Liouville theory shares many more analogies with vector algebra. In the same sense 
that the dot product of a vector with itself is guaranteed positive, we have defined a "product" 
for the eigenfunctions in which the "product" of a Sturm-Liouville eigenfunction with itself 
is guaranteed positive. 

Motivated by Eq. (5.16), we can define functions $\varphi_n(x)$:

$$\varphi_n(x) = \frac{\sqrt{r(x)}}{K}\, y_n(x), \tag{5.17}$$

so that

$$\langle \varphi_n, \varphi_m \rangle = \int_{x_0}^{x_1} \varphi_n(x)\varphi_m(x)\, dx = \delta_{nm}. \tag{5.18}$$

Such functions are said to be orthonormal, in the same way that $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are
orthonormal. While orthonormal functions have great utility, note that in the context of our
Sturm-Liouville nomenclature, $\varphi_n(x)$ does not in general satisfy the Sturm-Liouville
equation: $L_s \varphi_n(x) \ne -\lambda_n \varphi_n(x)$. If, however, $r(x) = C$, where $C$ is a scalar constant, then
in fact $L_s \varphi_n(x) = -\lambda_n \varphi_n(x)$. Whatever the case, we are guaranteed $L_s y_n(x) = -\lambda_n y_n(x)$.
The $y_n(x)$ functions are orthogonal under the influence of the weighting function $r(x)$, but
not necessarily orthonormal. The following sections give special cases of the Sturm-Liouville
equation with general homogeneous boundary conditions.

5.1.1 Linear oscillator 

A linear oscillator gives perhaps the simplest example of a Sturm-Liouville problem. We will
consider the domain $x \in [0, 1]$. For other domains, we could easily transform coordinates;
e.g. if $x \in [x_0, x_1]$, then the linear mapping $\tilde{x} = (x - x_0)/(x_1 - x_0)$ lets us consider $\tilde{x} \in [0, 1]$.
The equations governing a linear oscillator with general homogeneous boundary conditions
are

$$\frac{d^2y}{dx^2} + \lambda y = 0, \qquad \alpha_1 y(0) + \alpha_2 \frac{dy}{dx}(0) = 0, \qquad \beta_1 y(1) + \beta_2 \frac{dy}{dx}(1) = 0. \tag{5.19}$$

Here we have

$$a(x) = 1, \tag{5.20}$$

$$b(x) = 0, \tag{5.21}$$

$$c(x) = 0, \tag{5.22}$$

so

$$p(x) = \exp\left(\int_{x_0}^{x} \frac{0}{1}\, ds\right) = e^0 = 1, \tag{5.23}$$

$$r(x) = \frac{1}{1}\exp\left(\int_{x_0}^{x} \frac{0}{1}\, ds\right) = e^0 = 1, \tag{5.24}$$

$$q(x) = \frac{0}{1}\exp\left(\int_{x_0}^{x} \frac{0}{1}\, ds\right) = 0. \tag{5.25}$$

So, we can consider the domain $x \in (-\infty, \infty)$. In practice it is more common to consider
the finite domain in which $x \in [0, 1]$. The Sturm-Liouville operator is

$$L_s = \frac{d^2}{dx^2}. \tag{5.26}$$

The eigenvalue problem is

$$\underbrace{\frac{d^2}{dx^2}}_{L_s}\, y(x) = -\lambda\, y(x). \tag{5.27}$$

We can find a series solution by assuming $y = \sum_{n=0}^{\infty} a_n x^n$. This leads us to the recursion
relationship

$$a_{n+2} = \frac{-\lambda\, a_n}{(n+1)(n+2)}. \tag{5.28}$$

So, given two seed values, $a_0$ and $a_1$, detailed analysis of the type considered in Sec. 4.1.2
reveals the solution can be expressed as the infinite series

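$$y(x) = a_0 \underbrace{\left(1 - \frac{(\sqrt{\lambda}x)^2}{2!} + \frac{(\sqrt{\lambda}x)^4}{4!} - \ldots\right)}_{\cos(\sqrt{\lambda}x)} + \frac{a_1}{\sqrt{\lambda}} \underbrace{\left(\sqrt{\lambda}x - \frac{(\sqrt{\lambda}x)^3}{3!} + \frac{(\sqrt{\lambda}x)^5}{5!} - \ldots\right)}_{\sin(\sqrt{\lambda}x)}. \tag{5.29}$$

The series is recognized as being composed of linear combinations of the Taylor series for
$\cos(\sqrt{\lambda}x)$ and $\sin(\sqrt{\lambda}x)$ about $x = 0$. Letting $a_0 = C_1$ and $a_1/\sqrt{\lambda} = C_2$, we can express the
general solution in terms of these two complementary functions as

$$y(x) = C_1 \cos(\sqrt{\lambda}x) + C_2 \sin(\sqrt{\lambda}x). \tag{5.30}$$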
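
As a quick check of the recursion in Eq. (5.28), the series can be summed numerically and compared with the trigonometric functions. This is only an illustrative sketch; the function name `oscillator_series`, the number of terms, and the sample values of $\lambda$ and $x$ are assumptions made here.

```python
import math

def oscillator_series(lam, a0, a1, x, terms=40):
    """Sum y = sum_n a_n x^n with a_{n+2} = -lam * a_n / ((n+1)(n+2))."""
    coeffs = [a0, a1]
    for n in range(terms - 2):
        coeffs.append(-lam * coeffs[n] / ((n + 1) * (n + 2)))
    return sum(c * x ** n for n, c in enumerate(coeffs))

lam, x = 4.0, 0.7
even_part = oscillator_series(lam, 1.0, 0.0, x)  # seeds a0 = 1, a1 = 0
odd_part = oscillator_series(lam, 0.0, 1.0, x)   # seeds a0 = 0, a1 = 1
s = math.sqrt(lam)
print(even_part, math.cos(s * x))
print(odd_part, math.sin(s * x) / s)
```

With seed $a_1 = 1$ the odd part sums to $\sin(\sqrt{\lambda}x)/\sqrt{\lambda}$; the factor $1/\sqrt{\lambda}$ is simply absorbed into the arbitrary constant $C_2$.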



Applying the general homogeneous boundary conditions from Eq. (5.19) leads to a
challenging problem for determining admissible eigenvalues $\lambda$. To apply the boundary conditions,
we need $dy/dx$, which is

$$\frac{dy}{dx} = -C_1\sqrt{\lambda}\sin(\sqrt{\lambda}x) + C_2\sqrt{\lambda}\cos(\sqrt{\lambda}x). \tag{5.31}$$

Enforcing the boundary conditions at $x = 0$ and $x = 1$ leads us to two equations:

$$\alpha_1 C_1 + \alpha_2\sqrt{\lambda}\, C_2 = 0, \tag{5.32}$$

$$C_1\left(\beta_1\cos\sqrt{\lambda} - \beta_2\sqrt{\lambda}\sin\sqrt{\lambda}\right) + C_2\left(\beta_1\sin\sqrt{\lambda} + \beta_2\sqrt{\lambda}\cos\sqrt{\lambda}\right) = 0. \tag{5.33}$$

This can be posed as the linear system

$$\begin{pmatrix} \alpha_1 & \alpha_2\sqrt{\lambda} \\ \beta_1\cos\sqrt{\lambda} - \beta_2\sqrt{\lambda}\sin\sqrt{\lambda} & \beta_1\sin\sqrt{\lambda} + \beta_2\sqrt{\lambda}\cos\sqrt{\lambda} \end{pmatrix} \cdot \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{5.34}$$

For non-trivial solutions, the determinant of the coefficient matrix must be zero, which leads
to the transcendental equation

$$\alpha_1\left(\beta_1\sin\sqrt{\lambda} + \beta_2\sqrt{\lambda}\cos\sqrt{\lambda}\right) - \alpha_2\sqrt{\lambda}\left(\beta_1\cos\sqrt{\lambda} - \beta_2\sqrt{\lambda}\sin\sqrt{\lambda}\right) = 0. \tag{5.35}$$

For known values of $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$, one seeks values of $\lambda$ which satisfy Eq. (5.35). This
solution must in general be found numerically, except for the simplest of cases.
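
One way the numerical solution of Eq. (5.35) might be carried out is a sign-change scan followed by bisection in the variable $s = \sqrt{\lambda}$. The sketch below is an assumption-laden illustration rather than the method of the notes: the function names, the scan range `s_max`, the grid resolution, and the sample mixed boundary conditions $y(0) = 0$, $y(1) + y'(1) = 0$ are all chosen here for demonstration.

```python
import math

def characteristic(s, alpha1, alpha2, beta1, beta2):
    """Left side of Eq. (5.35), written in terms of s = sqrt(lambda)."""
    return (alpha1 * (beta1 * math.sin(s) + beta2 * s * math.cos(s))
            - alpha2 * s * (beta1 * math.cos(s) - beta2 * s * math.sin(s)))

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def eigenvalues(alpha1, alpha2, beta1, beta2, s_max=12.0, steps=1200):
    """Scan s in (0, s_max] for sign changes; return lambda = s^2 at each root."""
    f = lambda s: characteristic(s, alpha1, alpha2, beta1, beta2)
    ds = s_max / steps
    roots = []
    for i in range(1, steps):
        lo, hi = i * ds, (i + 1) * ds
        if f(lo) * f(hi) < 0:
            roots.append(bisect(f, lo, hi) ** 2)
    return roots

# Assumed sample conditions: y(0) = 0 and y(1) + y'(1) = 0.
mixed = eigenvalues(1.0, 0.0, 1.0, 1.0)
# Dirichlet conditions y(0) = y(1) = 0 for comparison with sqrt(lambda) = n*pi.
dirichlet = eigenvalues(1.0, 0.0, 1.0, 0.0)
print([math.sqrt(lam) for lam in mixed[:3]])
print([math.sqrt(lam) / math.pi for lam in dirichlet[:3]])
```

A scan of this kind can miss roots where the function touches zero without changing sign, so it should be treated as a first pass rather than a guaranteed root finder.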

One important simple case is for $\alpha_1 = 1$, $\alpha_2 = 0$, $\beta_1 = 1$, $\beta_2 = 0$. This gives the boundary
conditions to be $y(0) = y(1) = 0$. Boundary conditions where the function values are
specified are known as Dirichlet conditions. In this case, Eq. (5.35) reduces to $\sin\sqrt{\lambda} = 0$,
which is easily solved as $\sqrt{\lambda} = n\pi$, with $n = 0, \pm 1, \pm 2, \ldots$. We also get $C_1 = 0$; consequently,
$y = C_2\sin(n\pi x)$. Note that for $n = 0$, the solution is the trivial $y = 0$.

Another set of conditions also leads to a similarly simple result. Taking $\alpha_1 = 0$, $\alpha_2 = 1$,
$\beta_1 = 0$, $\beta_2 = 1$, the boundary conditions are $y'(0) = y'(1) = 0$. Boundary conditions
where the function's derivative values are specified are known as Neumann conditions. In
this case, Eq. (5.35) reduces to $\lambda\sin\sqrt{\lambda} = 0$, which is easily solved as $\sqrt{\lambda} = n\pi$, with
$n = 0, \pm 1, \pm 2, \ldots$. We also get $C_2 = 0$; consequently, $y = C_1\cos(n\pi x)$. Here, for $n = 0$, the
solution is the non-trivial $y = C_1$.

Some of the eigenfunctions for Dirichlet and Neumann boundary conditions are plotted
in Fig. 5.1. Note these two families form the linearly independent complementary functions
of Eq. (5.19). Also note that as $n$ rises, the number of zero-crossings within the domain
rises. This will be seen to be characteristic of all sets of eigenfunctions for Sturm-Liouville
equations.



Example 5.1

Find the eigenvalues and eigenfunctions for a linear oscillator equation with Dirichlet boundary
conditions:

$$\frac{d^2y}{dx^2} + \lambda y = 0, \qquad y(0) = y(\ell) = 0. \tag{5.36}$$



3 Johann Peter Gustav Lejeune Dirichlet, 1805-1859, German mathematician who formally defined a func- 
tion in the modern sense. 

4 Carl Gottfried Neumann, 1832-1925, German mathematician. 
Figure 5.1: Solutions to the linear oscillator equation, Eq. (5.19), in terms of two sets of
complementary functions, $\sin(n\pi x)$ and $\cos(n\pi x)$.



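We could transform the domain via $\tilde{x} = x/\ell$ so that $\tilde{x} \in [0, 1]$, but this problem is sufficiently
straightforward to allow us to deal with the original domain. We know by inspection that the general
solution is

$$y(x) = C_1\cos(\sqrt{\lambda}x) + C_2\sin(\sqrt{\lambda}x). \tag{5.37}$$

For $y(0) = 0$, we get

$$y(0) = 0 = C_1\cos(\sqrt{\lambda}(0)) + C_2\sin(\sqrt{\lambda}(0)), \tag{5.38}$$

$$0 = C_1(1) + C_2(0), \tag{5.39}$$

$$C_1 = 0. \tag{5.40}$$

So

$$y(x) = C_2\sin(\sqrt{\lambda}x). \tag{5.41}$$

At the boundary at $x = \ell$ we have

$$y(\ell) = 0 = C_2\sin(\sqrt{\lambda}\,\ell). \tag{5.42}$$

For non-trivial solutions we need $C_2 \ne 0$, which then requires that

$$\sqrt{\lambda}\,\ell = n\pi, \qquad n = \pm 1, \pm 2, \pm 3, \ldots, \tag{5.43}$$

so

$$\lambda = \left(\frac{n\pi}{\ell}\right)^2. \tag{5.44}$$

The eigenvalues and eigenfunctions are

$$\lambda_n = \frac{n^2\pi^2}{\ell^2}, \tag{5.45}$$

and

$$y_n(x) = \sin\left(\frac{n\pi x}{\ell}\right), \tag{5.46}$$

respectively.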
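Check orthogonality for $y_2(x)$ and $y_3(x)$.

$$I = \int_0^{\ell} \sin\left(\frac{2\pi x}{\ell}\right)\sin\left(\frac{3\pi x}{\ell}\right) dx, \tag{5.47}$$

$$= \frac{\ell}{2\pi}\left.\left(\sin\left(\frac{\pi x}{\ell}\right) - \frac{1}{5}\sin\left(\frac{5\pi x}{\ell}\right)\right)\right|_0^{\ell}, \tag{5.48}$$

$$= 0. \tag{5.49}$$

Check orthogonality for $y_4(x)$ and $y_4(x)$.

$$I = \int_0^{\ell} \sin\left(\frac{4\pi x}{\ell}\right)\sin\left(\frac{4\pi x}{\ell}\right) dx, \tag{5.50}$$

$$= \left.\left(\frac{x}{2} - \frac{\ell}{16\pi}\sin\left(\frac{8\pi x}{\ell}\right)\right)\right|_0^{\ell}, \tag{5.51}$$

$$= \frac{\ell}{2}. \tag{5.52}$$

In fact

$$\int_0^{\ell} \sin\left(\frac{n\pi x}{\ell}\right)\sin\left(\frac{n\pi x}{\ell}\right) dx = \frac{\ell}{2}, \tag{5.53}$$

so the orthonormal functions $\varphi_n(x)$ for this problem are

$$\varphi_n(x) = \sqrt{\frac{2}{\ell}}\sin\left(\frac{n\pi x}{\ell}\right). \tag{5.54}$$

With this choice, we recover the orthonormality condition

$$\int_0^{\ell} \varphi_n(x)\varphi_m(x)\, dx = \delta_{nm}, \tag{5.55}$$

$$\frac{2}{\ell}\int_0^{\ell} \sin\left(\frac{n\pi x}{\ell}\right)\sin\left(\frac{m\pi x}{\ell}\right) dx = \delta_{nm}. \tag{5.56}$$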
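
The orthogonality integrals of this example can also be checked by direct quadrature. The sketch below is illustrative only; the helper names and the particular domain length `ell = 2.5` are arbitrary choices made here, and the integration uses a composite Simpson rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner(n, m, ell):
    """Integral over [0, ell] of sin(n pi x/ell) sin(m pi x/ell); here r(x) = 1."""
    return simpson(
        lambda x: math.sin(n * math.pi * x / ell) * math.sin(m * math.pi * x / ell),
        0.0, ell)

ell = 2.5  # arbitrary domain length, assumed for illustration
print(inner(2, 3, ell))                # compare with Eq. (5.49)
print(inner(4, 4, ell), ell / 2)       # compare with Eq. (5.52)
print((2.0 / ell) * inner(4, 4, ell))  # compare with Eq. (5.56), n = m
```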



5.1.2 Legendre's differential equation 

Legendre's differential equation is given next. Here, it is convenient to let the term $n(n+1)$
play the role of $\lambda$.

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + n(n+1)\,y = 0. \tag{5.57}$$



5 Adrien-Marie Legendre, 1752-1833, French/Parisian mathematician. 
Here

$$a(x) = 1 - x^2, \tag{5.58}$$

$$b(x) = -2x, \tag{5.59}$$

$$c(x) = 0. \tag{5.60}$$

Then, taking $x_0 = -1$, we have

$$p(x) = \exp\int_{-1}^{x} \frac{-2s}{1 - s^2}\, ds, \tag{5.61}$$

$$= \left.\exp\left(\ln(1 - s^2)\right)\right|_{-1}^{x}, \tag{5.62}$$

$$= \left.(1 - s^2)\right|_{-1}^{x}, \tag{5.63}$$

$$= 1 - x^2. \tag{5.64}$$

We find then that

$$r(x) = 1, \tag{5.65}$$

$$q(x) = 0. \tag{5.66}$$

Thus, we require $x \in (-1, 1)$. In Sturm-Liouville form, Eq. (5.57) reduces to

$$\frac{d}{dx}\left((1 - x^2)\frac{dy}{dx}\right) + n(n+1)\,y = 0, \tag{5.67}$$

$$\underbrace{\frac{d}{dx}\left((1 - x^2)\frac{d}{dx}\right)}_{L_s} y(x) = -n(n+1)\,y(x). \tag{5.68}$$

So

$$L_s = \frac{d}{dx}\left((1 - x^2)\frac{d}{dx}\right). \tag{5.69}$$

Now $x = 0$ is a regular point, so we can expand in a power series around this point. Let

$$y = \sum_{m=0}^{\infty} a_m x^m. \tag{5.70}$$

Substituting into Eq. (5.57), we find after detailed analysis that

$$a_{m+2} = a_m\frac{(m + n + 1)(m - n)}{(m + 1)(m + 2)}. \tag{5.71}$$
With $a_0$ and $a_1$ as given seeds, we can thus generate all values of $a_m$ for $m \ge 2$. We find

$$y(x) = a_0\underbrace{\left(1 - n(n+1)\frac{x^2}{2!} + n(n+1)(n-2)(n+3)\frac{x^4}{4!} - \ldots\right)}_{y_1(x)} + a_1\underbrace{\left(x - (n-1)(n+2)\frac{x^3}{3!} + (n-1)(n+2)(n-3)(n+4)\frac{x^5}{5!} - \ldots\right)}_{y_2(x)}. \tag{5.72}$$

Thus, the general solution takes the form

$$y(x) = a_0 y_1(x) + a_1 y_2(x), \tag{5.73}$$

with complementary functions $y_1(x)$ and $y_2(x)$ defined as

$$y_1(x) = 1 - n(n+1)\frac{x^2}{2!} + n(n+1)(n-2)(n+3)\frac{x^4}{4!} - \ldots, \tag{5.74}$$

$$y_2(x) = x - (n-1)(n+2)\frac{x^3}{3!} + (n-1)(n+2)(n-3)(n+4)\frac{x^5}{5!} - \ldots. \tag{5.75}$$

This solution holds for arbitrary real values of $n$. However, for $n = 0, 2, 4, \ldots$, $y_1(x)$ is a finite
polynomial, while $y_2(x)$ is an infinite series which diverges at $|x| = 1$. For $n = 1, 3, 5, \ldots$, it
is the other way around. Thus, for integer, non-negative $n$ either 1) $y_1$ is a polynomial of
degree $n$, and $y_2$ is a polynomial of infinite degree, or 2) $y_1$ is a polynomial of infinite degree,
and $y_2$ is a polynomial of degree $n$.

We could in fact treat $y_1$ and $y_2$ as the complementary functions for Eq. (5.57). However,
the existence of finite degree polynomials in special cases has led to an alternate definition
of the standard complementary functions for Eq. (5.57). The finite polynomials ($y_1$ for even
$n$, and $y_2$ for odd $n$) can be normalized by dividing through by their values at $x = 1$ to give
the Legendre polynomials, $P_n(x)$:

$$P_n(x) = \begin{cases} \dfrac{y_1(x)}{y_1(1)}, & \text{for } n \text{ even}, \\[2mm] \dfrac{y_2(x)}{y_2(1)}, & \text{for } n \text{ odd}. \end{cases} \tag{5.76}$$

The Legendre polynomials are thus

$$n = 0, \quad P_0(x) = 1, \tag{5.77}$$

$$n = 1, \quad P_1(x) = x, \tag{5.78}$$

$$n = 2, \quad P_2(x) = \frac{1}{2}(3x^2 - 1), \tag{5.79}$$

$$n = 3, \quad P_3(x) = \frac{1}{2}(5x^3 - 3x), \tag{5.80}$$

$$n = 4, \quad P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3), \tag{5.81}$$

$$n, \quad P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n, \quad \text{Rodrigues' formula.} \tag{5.82}$$
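
The construction of $P_n(x)$ from the recursion, Eq. (5.71), followed by the normalization of Eq. (5.76), can be sketched in a few lines. The function names below are hypothetical, and the sketch assumes a non-negative integer $n$ so that the relevant series terminates.

```python
def legendre_coefficients(n):
    """Power-series coefficients of P_n from Eq. (5.71), normalized as in Eq. (5.76)."""
    a = [0.0] * (n + 1)
    a[n % 2] = 1.0  # seed y1 (a0 = 1) for even n, y2 (a1 = 1) for odd n
    for m in range(n % 2, n - 1, 2):
        a[m + 2] = a[m] * (m + n + 1) * (m - n) / ((m + 1) * (m + 2))
    value_at_one = sum(a)  # the polynomial evaluated at x = 1
    return [c / value_at_one for c in a]

def legendre(n, x):
    """Evaluate P_n(x) from its power-series coefficients."""
    return sum(c * x ** k for k, c in enumerate(legendre_coefficients(n)))

for n in range(5):
    print(n, legendre_coefficients(n))
```

The printed coefficient lists can be compared term by term with Eqs. (5.77-5.81); for example, $n = 2$ should give the coefficients of $\frac{1}{2}(3x^2 - 1)$.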
The Rodrigues formula gives a generating formula for general $n$.
The orthogonality condition is

$$\int_{-1}^{1} P_n(x)P_m(x)\, dx = \frac{2}{2n + 1}\delta_{nm}. \tag{5.83}$$

Direct substitution shows that $P_n(x)$ satisfies both the differential equation, Eq. (5.57),
and the orthogonality condition. It is then easily shown that the following functions are
orthonormal on the interval $x \in (-1, 1)$:

$$\varphi_n(x) = \sqrt{n + \frac{1}{2}}\, P_n(x), \tag{5.84}$$

giving

$$\int_{-1}^{1} \varphi_n(x)\varphi_m(x)\, dx = \delta_{nm}. \tag{5.85}$$

The total solution, Eq. (5.73), can be recast as the sum of the finite sum of polynomials
$P_n(x)$ (Legendre functions of the first kind and degree $n$) and the infinite sum of polynomials
$Q_n(x)$ (Legendre functions of the second kind and degree $n$):

$$y(x) = C_1 P_n(x) + C_2 Q_n(x). \tag{5.86}$$

Here $Q_n(x)$, the infinite series portion of the solution, is obtained by

$$Q_n(x) = \begin{cases} y_1(1)\, y_2(x), & \text{for } n \text{ even}, \\ -y_2(1)\, y_1(x), & \text{for } n \text{ odd}. \end{cases} \tag{5.87}$$

One can also show the Legendre functions of the second kind, $Q_n(x)$, satisfy a similar
orthogonality condition. Additionally, $Q_n(\pm 1)$ is singular. One can further show that the infinite
series of polynomials which form $Q_n(x)$ can be recast as a finite series of polynomials along
with a logarithmic function. The first few values of $Q_n(x)$ are in fact

$$n = 0, \quad Q_0(x) = \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right), \tag{5.88}$$

$$n = 1, \quad Q_1(x) = \frac{x}{2}\ln\left(\frac{1 + x}{1 - x}\right) - 1, \tag{5.89}$$

$$n = 2, \quad Q_2(x) = \frac{3x^2 - 1}{4}\ln\left(\frac{1 + x}{1 - x}\right) - \frac{3}{2}x, \tag{5.90}$$

$$n = 3, \quad Q_3(x) = \frac{5x^3 - 3x}{4}\ln\left(\frac{1 + x}{1 - x}\right) - \frac{5}{2}x^2 + \frac{2}{3}. \tag{5.91}$$

The first few eigenfunctions of Eq. (5.57) for the two families of complementary functions
are plotted in Fig. 5.2.

6 Benjamin Olinde Rodrigues, 1794-1851, obscure French mathematician, of Portuguese and perhaps
Spanish roots.
Figure 5.2: Solutions to the Legendre equation, Eq. (5.57), in terms of two sets of
complementary functions, $P_n(x)$ and $Q_n(x)$.



5.1.3 Chebyshev equation 

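The Chebyshev equation is

$$(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + \lambda y = 0. \tag{5.92}$$

Let's get this into Sturm-Liouville form.

$$a(x) = 1 - x^2, \tag{5.93}$$

$$b(x) = -x, \tag{5.94}$$

$$c(x) = 0. \tag{5.95}$$

Now, taking $x_0 = -1$,

$$p(x) = \exp\left(\int_{-1}^{x} \frac{b(s)}{a(s)}\, ds\right), \tag{5.96}$$

$$= \exp\left(\int_{-1}^{x} \frac{-s}{1 - s^2}\, ds\right), \tag{5.97}$$

$$= \left.\exp\left(\frac{1}{2}\ln(1 - s^2)\right)\right|_{-1}^{x}, \tag{5.98}$$

$$= \left.\sqrt{1 - s^2}\right|_{-1}^{x}, \tag{5.99}$$

$$= \sqrt{1 - x^2}, \tag{5.100}$$

$$r(x) = \frac{\exp\left(\int_{-1}^{x} \frac{b(s)}{a(s)}\, ds\right)}{a(x)} = \frac{1}{\sqrt{1 - x^2}}, \tag{5.101}$$

$$q(x) = 0. \tag{5.102}$$

7 Pafnuty Lvovich Chebyshev, 1821-1894, Russian mathematician.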
Thus, for $p(x) > 0$, we require $x \in (-1, 1)$. The Chebyshev equation, Eq. (5.92), in
Sturm-Liouville form is

$$\frac{d}{dx}\left(\sqrt{1 - x^2}\,\frac{dy}{dx}\right) + \frac{\lambda}{\sqrt{1 - x^2}}\, y = 0, \tag{5.103}$$

$$\underbrace{\sqrt{1 - x^2}\,\frac{d}{dx}\left(\sqrt{1 - x^2}\,\frac{d}{dx}\right)}_{L_s} y(x) = -\lambda\, y(x). \tag{5.104}$$

Thus,

$$L_s = \sqrt{1 - x^2}\,\frac{d}{dx}\left(\sqrt{1 - x^2}\,\frac{d}{dx}\right). \tag{5.105}$$

That the two forms are equivalent can be easily checked by direct expansion.

Series solution techniques reveal for eigenvalues of $\lambda$ one family of complementary
functions of Eq. (5.92) can be written in terms of the so-called Chebyshev polynomials, $T_n(x)$.
These are also known as Chebyshev polynomials of the first kind. These polynomials can be
obtained by a regular series expansion of the original differential equation. These eigenvalues
and eigenfunctions are listed next:

$$\lambda = 0, \quad T_0(x) = 1, \tag{5.106}$$

$$\lambda = 1, \quad T_1(x) = x, \tag{5.107}$$

$$\lambda = 4, \quad T_2(x) = -1 + 2x^2, \tag{5.108}$$

$$\lambda = 9, \quad T_3(x) = -3x + 4x^3, \tag{5.109}$$

$$\lambda = 16, \quad T_4(x) = 1 - 8x^2 + 8x^4, \tag{5.110}$$

$$\lambda = n^2, \quad T_n(x) = \cos(n\cos^{-1}x), \quad \text{Rodrigues' formula.} \tag{5.111}$$

The orthogonality condition is

$$\int_{-1}^{1} \frac{T_n(x)T_m(x)}{\sqrt{1 - x^2}}\, dx = \begin{cases} \pi\delta_{nm}, & \text{if } n = 0, \\ \frac{\pi}{2}\delta_{nm}, & \text{if } n = 1, 2, \ldots \end{cases} \tag{5.112}$$

Direct substitution shows that $T_n(x)$ satisfies both the differential equation, Eq. (5.92), and
the orthogonality condition. We can deduce then that the functions $\varphi_n(x)$

$$\varphi_n(x) = \begin{cases} \sqrt{\dfrac{1}{\pi\sqrt{1 - x^2}}}\; T_n(x), & \text{if } n = 0, \\[3mm] \sqrt{\dfrac{2}{\pi\sqrt{1 - x^2}}}\; T_n(x), & \text{if } n = 1, 2, \ldots \end{cases} \tag{5.113}$$

are an orthonormal set of functions on the interval $x \in (-1, 1)$. That is,

$$\int_{-1}^{1} \varphi_n(x)\varphi_m(x)\, dx = \delta_{nm}. \tag{5.114}$$
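
The weighted orthogonality condition, Eq. (5.112), has an integrable singularity at $x = \pm 1$, which makes naive quadrature awkward. A common workaround, sketched here under the assumption that the substitution $x = \cos\theta$ is acceptable, turns the integral into that of $\cos(n\theta)\cos(m\theta)$ over $\theta \in (0, \pi)$. The function names and the midpoint rule are illustrative choices, not part of the notes.

```python
import math

def chebyshev_T(n, x):
    """T_n(x) from the trigonometric form in Eq. (5.111)."""
    return math.cos(n * math.acos(x))

def weighted_inner(n, m, panels=2000):
    """Integral of T_n T_m / sqrt(1 - x^2) over (-1, 1) using x = cos(theta).

    The substitution removes the endpoint singularity; the remaining integral
    of cos(n theta) cos(m theta) over (0, pi) is done with the midpoint rule.
    """
    h = math.pi / panels
    return h * sum(
        math.cos(n * (i + 0.5) * h) * math.cos(m * (i + 0.5) * h)
        for i in range(panels))

print(chebyshev_T(3, 0.3), -3 * 0.3 + 4 * 0.3 ** 3)  # compare with Eq. (5.109)
print(weighted_inner(0, 0), math.pi)
print(weighted_inner(2, 2), math.pi / 2)
print(weighted_inner(2, 3))
```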
The Chebyshev polynomials of the first kind, $T_n(x)$, form one set of complementary
functions which satisfy Eq. (5.92). The other set of complementary functions are $V_n(x)$, and can
be shown to be

$$\lambda = 0, \quad V_0(x) = 0, \tag{5.115}$$

$$\lambda = 1, \quad V_1(x) = \sqrt{1 - x^2}, \tag{5.116}$$

$$\lambda = 4, \quad V_2(x) = \sqrt{1 - x^2}\,(2x), \tag{5.117}$$

$$\lambda = 9, \quad V_3(x) = \sqrt{1 - x^2}\,(-1 + 4x^2), \tag{5.118}$$

$$\lambda = 16, \quad V_4(x) = \sqrt{1 - x^2}\,(-4x + 8x^3), \tag{5.119}$$

$$\lambda = n^2, \quad V_n(x) = \sin(n\cos^{-1}x), \quad \text{Rodrigues' formula.} \tag{5.120}$$

The general solution to Eq. (5.92) is a linear combination of the two complementary
functions:

$$y(x) = C_1 T_n(x) + C_2 V_n(x). \tag{5.121}$$

One can also show that $V_n(x)$ satisfies an orthogonality condition:

$$\int_{-1}^{1} \frac{V_n(x)V_m(x)}{\sqrt{1 - x^2}}\, dx = \frac{\pi}{2}\delta_{nm}. \tag{5.122}$$

The first few eigenfunctions of Eq. (5.92) for the two families of complementary functions
are plotted in Fig. 5.3.




Figure 5.3: Solutions to the Chebyshev equation, Eq. (5.92), in terms of two sets of
complementary functions, $T_n(x)$ and $V_n(x)$.
5.1.4 Hermite equation

The Hermite equation is discussed next. There are two common formulations, the physicists'
and the probabilists'. We will focus on the first and briefly discuss the second.

5.1.4.1 Physicists'

The physicists' Hermite equation is

$$\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + \lambda y = 0. \tag{5.123}$$

We find that

$$p(x) = e^{-x^2}, \tag{5.124}$$

$$r(x) = e^{-x^2}, \tag{5.125}$$

$$q(x) = 0. \tag{5.126}$$

Thus, we allow $x \in (-\infty, \infty)$. In Sturm-Liouville form, Eq. (5.123) becomes

$$\frac{d}{dx}\left(e^{-x^2}\frac{dy}{dx}\right) + \lambda e^{-x^2} y = 0, \tag{5.127}$$

$$\underbrace{e^{x^2}\frac{d}{dx}\left(e^{-x^2}\frac{d}{dx}\right)}_{L_s} y(x) = -\lambda\, y(x). \tag{5.128}$$

So

$$L_s = e^{x^2}\frac{d}{dx}\left(e^{-x^2}\frac{d}{dx}\right). \tag{5.129}$$

One set of complementary functions can be expressed in terms of polynomials known as the
Hermite polynomials, $H_n(x)$. These polynomials can be obtained by a regular series
expansion of the original differential equation. The eigenvalues and eigenfunctions corresponding
to the physicists' Hermite polynomials are listed next:

$$\lambda = 0, \quad H_0(x) = 1, \tag{5.130}$$

$$\lambda = 2, \quad H_1(x) = 2x, \tag{5.131}$$

$$\lambda = 4, \quad H_2(x) = -2 + 4x^2, \tag{5.132}$$

$$\lambda = 6, \quad H_3(x) = -12x + 8x^3, \tag{5.133}$$

$$\lambda = 8, \quad H_4(x) = 12 - 48x^2 + 16x^4, \tag{5.134}$$

$$\vdots \tag{5.135}$$

$$\lambda = 2n, \quad H_n(x) = (-1)^n e^{x^2}\frac{d^n e^{-x^2}}{dx^n}, \quad \text{Rodrigues' formula.} \tag{5.136}$$

8 Charles Hermite, 1822-1901, Lorraine-born French mathematician.

The orthogonality condition is

$$\int_{-\infty}^{\infty} e^{-x^2} H_n(x)H_m(x)\, dx = 2^n n!\sqrt{\pi}\,\delta_{nm}. \tag{5.137}$$

Direct substitution shows that $H_n(x)$ satisfies both the differential equation, Eq. (5.123),
and the orthogonality condition. It is then easily shown that the following functions are
orthonormal on the interval $x \in (-\infty, \infty)$:

$$\varphi_n(x) = \frac{e^{-x^2/2} H_n(x)}{\sqrt{\sqrt{\pi}\, 2^n n!}}, \tag{5.138}$$

giving

$$\int_{-\infty}^{\infty} \varphi_n(x)\varphi_m(x)\, dx = \delta_{mn}. \tag{5.139}$$

The general solution to Eq. (5.123) is

$$y(x) = C_1 H_n(x) + C_2 \hat{H}_n(x), \tag{5.140}$$

where the other set of complementary functions is $\hat{H}_n(x)$. For general $n$, $\hat{H}_n(x)$ is a
version of the so-called Kummer confluent hypergeometric function of the first kind, $\hat{H}_n(x) =
{}_1F_1(-n/2; 1/2; x^2)$. Note, this general solution should be treated carefully, especially as the
second complementary function, $\hat{H}_n(x)$, is rarely discussed in the literature, and notation is
often non-standard. For our eigenvalues of $n$, somewhat simpler results can be obtained in
terms of the imaginary error function, $\mathrm{erfi}(x)$; see Sec. 10.7.4. The first few of these functions
are

$$\lambda = 0, \quad n = 0, \quad \hat{H}_0(x) = \frac{\sqrt{\pi}}{2}\,\mathrm{erfi}(x), \tag{5.141}$$

$$\lambda = 2, \quad n = 1, \quad \hat{H}_1(x) = e^{x^2} - \sqrt{\pi}\, x\,\mathrm{erfi}(x), \tag{5.142}$$

$$\lambda = 4, \quad n = 2, \quad \hat{H}_2(x) = -x e^{x^2} + \sqrt{\pi}\,\mathrm{erfi}(x)\left(x^2 - \frac{1}{2}\right), \tag{5.143}$$

$$\lambda = 6, \quad n = 3, \quad \hat{H}_3(x) = e^{x^2}(1 - x^2) + \sqrt{\pi}\, x\,\mathrm{erfi}(x)\left(x^2 - \frac{3}{2}\right). \tag{5.144}$$

The first few eigenfunctions of the Hermite equation, Eq. (5.123), for the two families of
complementary functions are plotted in Fig. 5.4.

Figure 5.4: Solutions to the physicists' Hermite equation, Eq. (5.123), in terms of two sets
of complementary functions $H_n(x)$ and $\hat{H}_n(x)$.

5.1.4.2 Probabilists'

The probabilists' Hermite equation is

$$\frac{d^2y}{dx^2} - x\frac{dy}{dx} + \lambda y = 0. \tag{5.145}$$

We find that

$$p(x) = e^{-x^2/2}, \tag{5.146}$$

$$r(x) = e^{-x^2/2}, \tag{5.147}$$

$$q(x) = 0. \tag{5.148}$$

Thus, we allow $x \in (-\infty, \infty)$. In Sturm-Liouville form, Eq. (5.145) becomes

$$\frac{d}{dx}\left(e^{-x^2/2}\frac{dy}{dx}\right) + \lambda e^{-x^2/2} y = 0, \tag{5.149}$$

$$\underbrace{e^{x^2/2}\frac{d}{dx}\left(e^{-x^2/2}\frac{d}{dx}\right)}_{L_s} y(x) = -\lambda\, y(x). \tag{5.150}$$

So

$$L_s = e^{x^2/2}\frac{d}{dx}\left(e^{-x^2/2}\frac{d}{dx}\right). \tag{5.151}$$

One set of complementary functions can be expressed in terms of polynomials known as the
probabilists' Hermite polynomials, $He_n(x)$. These polynomials can be obtained by a regular
series expansion of the original differential equation. The eigenvalues and eigenfunctions
corresponding to the probabilists' Hermite polynomials are listed next:

$$\lambda = 0, \quad He_0(x) = 1, \tag{5.152}$$

$$\lambda = 1, \quad He_1(x) = x, \tag{5.153}$$

$$\lambda = 2, \quad He_2(x) = -1 + x^2, \tag{5.154}$$

$$\lambda = 3, \quad He_3(x) = -3x + x^3, \tag{5.155}$$

$$\lambda = 4, \quad He_4(x) = 3 - 6x^2 + x^4, \tag{5.156}$$

$$\vdots \tag{5.157}$$

$$\lambda = n, \quad He_n(x) = (-1)^n e^{x^2/2}\frac{d^n e^{-x^2/2}}{dx^n}, \quad \text{Rodrigues' formula.} \tag{5.158}$$

The orthogonality condition is

$$\int_{-\infty}^{\infty} e^{-x^2/2} He_n(x)He_m(x)\, dx = n!\sqrt{2\pi}\,\delta_{nm}. \tag{5.159}$$

Direct substitution shows that $He_n(x)$ satisfies both the differential equation, Eq. (5.145),
and the orthogonality condition. It is then easily shown that the following functions are
orthonormal on the interval $x \in (-\infty, \infty)$:

$$\varphi_n(x) = \frac{e^{-x^2/4} He_n(x)}{\sqrt{\sqrt{2\pi}\, n!}}, \tag{5.160}$$

giving

$$\int_{-\infty}^{\infty} \varphi_n(x)\varphi_m(x)\, dx = \delta_{mn}. \tag{5.161}$$

Plots and the second set of complementary functions for the probabilists' Hermite equation
are obtained in a similar manner to those for the physicists'. One can easily show the relation
between the two to be

$$He_n(x) = 2^{-n/2} H_n\left(\frac{x}{\sqrt{2}}\right). \tag{5.162}$$
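
The relation in Eq. (5.162) is easy to spot-check numerically. The sketch below generates both families from their standard three-term recurrences, $H_{k+1} = 2xH_k - 2kH_{k-1}$ and $He_{k+1} = xHe_k - kHe_{k-1}$; these recurrences are well-known results that are not derived in these notes, and the function names are hypothetical.

```python
import math

def hermite_phys(n, x):
    """Physicists' H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}, H_0 = 1, H_1 = 2x."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def hermite_prob(n, x):
    """Probabilists' He_n(x) via He_{k+1} = x He_k - k He_{k-1}, He_0 = 1, He_1 = x."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

x = 0.8  # arbitrary sample point
for n in range(6):
    lhs = hermite_prob(n, x)
    rhs = 2.0 ** (-n / 2.0) * hermite_phys(n, x / math.sqrt(2.0))
    print(n, lhs, rhs)
```

The two printed columns should agree to rounding error for each $n$, and the low-order values can be compared with Eqs. (5.130-5.134) and (5.152-5.156).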

5.1.5 Laguerre equation 

The Laguerre equation is

$$x\frac{d^2y}{dx^2} + (1 - x)\frac{dy}{dx} + \lambda y = 0. \tag{5.163}$$

We find that

$$p(x) = xe^{-x}, \tag{5.164}$$

$$r(x) = e^{-x}, \tag{5.165}$$

$$q(x) = 0. \tag{5.166}$$

Thus, we require $x \in (0, \infty)$.

In Sturm-Liouville form, Eq. (5.163) becomes

$$\frac{d}{dx}\left(xe^{-x}\frac{dy}{dx}\right) + \lambda e^{-x} y = 0, \tag{5.167}$$

$$\underbrace{e^{x}\frac{d}{dx}\left(xe^{-x}\frac{d}{dx}\right)}_{L_s} y(x) = -\lambda\, y(x). \tag{5.168}$$

So

$$L_s = e^{x}\frac{d}{dx}\left(xe^{-x}\frac{d}{dx}\right). \tag{5.169}$$

9 Edmond Nicolas Laguerre, 1834-1886, French mathematician.
One set of the complementary functions can be expressed in terms of polynomials of finite
order known as the Laguerre polynomials, $L_n(x)$. These polynomials can be obtained by a
regular series expansion of Eq. (5.163). Eigenvalues and eigenfunctions corresponding to the
Laguerre polynomials are listed next:

$$\lambda = 0, \quad L_0(x) = 1, \tag{5.170}$$

$$\lambda = 1, \quad L_1(x) = 1 - x, \tag{5.171}$$

$$\lambda = 2, \quad L_2(x) = 1 - 2x + \frac{1}{2}x^2, \tag{5.172}$$

$$\lambda = 3, \quad L_3(x) = 1 - 3x + \frac{3}{2}x^2 - \frac{1}{6}x^3, \tag{5.173}$$

$$\lambda = 4, \quad L_4(x) = 1 - 4x + 3x^2 - \frac{2}{3}x^3 + \frac{1}{24}x^4, \tag{5.174}$$

$$\vdots \tag{5.175}$$

$$\lambda = n, \quad L_n(x) = \frac{1}{n!}e^{x}\frac{d^n\left(x^n e^{-x}\right)}{dx^n}, \quad \text{Rodrigues' formula.} \tag{5.176}$$

The orthogonality condition reduces to

$$\int_0^{\infty} e^{-x} L_n(x)L_m(x)\, dx = \delta_{nm}. \tag{5.177}$$

Direct substitution shows that $L_n(x)$ satisfies both the differential equation, Eq. (5.163),
and the orthogonality condition. It is then easily shown that the following functions are
orthonormal on the interval $x \in (0, \infty)$:

$$\varphi_n(x) = e^{-x/2} L_n(x), \tag{5.178}$$

so that

$$\int_0^{\infty} \varphi_n(x)\varphi_m(x)\, dx = \delta_{nm}. \tag{5.179}$$
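
Because the Laguerre polynomials have rational coefficients and $\int_0^\infty e^{-x}x^j\,dx = j!$, the orthogonality condition, Eq. (5.177), can be verified exactly in rational arithmetic. The sketch below assumes the standard closed form $L_n(x) = \sum_k (-1)^k \binom{n}{k} x^k/k!$, which is consistent with the polynomials listed above but is not stated in the notes; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_coefficients(n):
    """Coefficients c_k of L_n(x) = sum_k c_k x^k, with c_k = (-1)^k C(n, k) / k!."""
    return [Fraction((-1) ** k * comb(n, k), factorial(k)) for k in range(n + 1)]

def weighted_inner(n, m):
    """Exact integral of exp(-x) L_n(x) L_m(x) over (0, infinity).

    Uses term-by-term integration with int_0^inf exp(-x) x^j dx = j!.
    """
    cn, cm = laguerre_coefficients(n), laguerre_coefficients(m)
    return sum(a * b * factorial(i + j)
               for i, a in enumerate(cn) for j, b in enumerate(cm))

print(laguerre_coefficients(4))  # compare with Eq. (5.174)
print(weighted_inner(2, 2), weighted_inner(2, 3), weighted_inner(4, 4))
```

Working with `Fraction` avoids the truncation and round-off issues that a numerical quadrature over a semi-infinite domain would introduce.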
The general solution to Eq. (5.163) is

$$y(x) = C_1 L_n(x) + C_2 \hat{L}_n(x), \tag{5.180}$$

where the other set of complementary functions is $\hat{L}_n(x)$. For general $n$, $\hat{L}_n(x) = U(-n, 1, x)$,
one of the so-called Tricomi confluent hypergeometric functions. Again the literature is not
extensive on these functions. For integer eigenvalues $n$, $\hat{L}_n(x)$ reduces somewhat and can be
expressed in terms of the exponential integral function, $\mathrm{Ei}(x)$; see Sec. 10.7.6. The first few
of these functions are

$$\lambda = n = 0, \quad \hat{L}_0(x) = \mathrm{Ei}(x), \tag{5.181}$$

$$\lambda = n = 1, \quad \hat{L}_1(x) = -e^{x} - \mathrm{Ei}(x)(1 - x), \tag{5.182}$$

$$\lambda = n = 2, \quad \hat{L}_2(x) = \frac{1}{4}\left(e^{x}(3 - x) + \mathrm{Ei}(x)\left(2 - 4x + x^2\right)\right), \tag{5.183}$$

$$\lambda = n = 3, \quad \hat{L}_3(x) = \frac{1}{36}\left(e^{x}\left(-11 + 8x - x^2\right) + \mathrm{Ei}(x)\left(-6 + 18x - 9x^2 + x^3\right)\right). \tag{5.184}$$

The first few eigenfunctions of the Laguerre equation, Eq. (5.163), for the two families of
complementary functions are plotted in Fig. 5.5.

Figure 5.5: Solutions to the Laguerre equation, Eq. (5.163), in terms of two sets of
complementary functions, $L_n(x)$ and $\hat{L}_n(x)$.



5.1.6 Bessel's differential equation 
5.1.6.1 First and second kind 



Bessel's differential equation is as follows, with it being convenient to define $\lambda = -\nu^2$.

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (\mu^2x^2 - \nu^2)\,y = 0. \tag{5.185}$$

10 Friedrich Wilhelm Bessel, 1784-1846, Westphalia-born German mathematician.
We find that

$$p(x) = x, \tag{5.186}$$

$$r(x) = \frac{1}{x}, \tag{5.187}$$

$$q(x) = \mu^2 x. \tag{5.188}$$

We thus require $x \in (0, \infty)$, though in practice, it is more common to employ a finite domain
such as $x \in (0, \ell)$. In Sturm-Liouville form, we have

$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \left(\mu^2 x - \frac{\nu^2}{x}\right) y = 0, \tag{5.189}$$

$$\underbrace{x\left(\frac{d}{dx}\left(x\frac{d}{dx}\right) + \mu^2 x\right)}_{L_s} y(x) = \nu^2 y(x). \tag{5.190}$$

The Sturm-Liouville operator is

$$L_s = x\left(\frac{d}{dx}\left(x\frac{d}{dx}\right) + \mu^2 x\right). \tag{5.191}$$

In some other cases it is more convenient to take $\lambda = \mu^2$, in which case we get

$$p(x) = x, \tag{5.192}$$

$$r(x) = x, \tag{5.193}$$

$$q(x) = -\frac{\nu^2}{x}, \tag{5.194}$$

and the Sturm-Liouville form and operator are:

$$\underbrace{\frac{1}{x}\left(\frac{d}{dx}\left(x\frac{d}{dx}\right) - \frac{\nu^2}{x}\right)}_{L_s} y(x) = -\mu^2 y(x), \tag{5.195}$$

$$L_s = \frac{1}{x}\left(\frac{d}{dx}\left(x\frac{d}{dx}\right) - \frac{\nu^2}{x}\right). \tag{5.196}$$

The general solution is

$$y(x) = C_1 J_\nu(\mu x) + C_2 Y_\nu(\mu x), \quad \text{if } \nu \text{ is an integer}, \tag{5.197}$$

$$y(x) = C_1 J_\nu(\mu x) + C_2 J_{-\nu}(\mu x), \quad \text{if } \nu \text{ is not an integer}, \tag{5.198}$$

where $J_\nu(\mu x)$ and $Y_\nu(\mu x)$ are called the Bessel and Neumann functions of order $\nu$. Often
$J_\nu(\mu x)$ is known as a Bessel function of the first kind and $Y_\nu(\mu x)$ is known as a Bessel
function of the second kind. Both $J_\nu$ and $Y_\nu$ are represented by infinite series rather than
finite series such as the series for Legendre polynomials.

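The Bessel function of the first kind of order $\nu$, $J_\nu(\mu x)$, is represented by

$$J_\nu(\mu x) = \left(\frac{1}{2}\mu x\right)^{\nu}\sum_{k=0}^{\infty}\frac{\left(-\frac{1}{4}\mu^2x^2\right)^k}{k!\,\Gamma(\nu + 1 + k)}. \tag{5.199}$$

The Neumann function $Y_\nu(\mu x)$ has a complicated series representation (see Hildebrand).
The representations for $J_0(\mu x)$ and $Y_0(\mu x)$ are

$$J_0(\mu x) = 1 - \frac{\left(\frac{1}{4}\mu^2x^2\right)^1}{(1!)^2} + \frac{\left(\frac{1}{4}\mu^2x^2\right)^2}{(2!)^2} + \ldots + \frac{\left(-\frac{1}{4}\mu^2x^2\right)^n}{(n!)^2} + \ldots, \tag{5.200}$$

$$Y_0(\mu x) = \frac{2}{\pi}\left(\ln\left(\frac{1}{2}\mu x\right) + \gamma\right)J_0(\mu x) \tag{5.201}$$

$$\qquad + \frac{2}{\pi}\left(\frac{\left(\frac{1}{4}\mu^2x^2\right)^1}{(1!)^2} - \left(1 + \frac{1}{2}\right)\frac{\left(\frac{1}{4}\mu^2x^2\right)^2}{(2!)^2} + \ldots\right). \tag{5.202}$$

It can be shown using term by term differentiation that

$$\frac{dJ_\nu(\mu x)}{dx} = \mu\,\frac{J_{\nu-1}(\mu x) - J_{\nu+1}(\mu x)}{2}, \qquad \frac{dY_\nu(\mu x)}{dx} = \mu\,\frac{Y_{\nu-1}(\mu x) - Y_{\nu+1}(\mu x)}{2}, \tag{5.203}$$

$$\frac{d}{dx}\left(x^{\nu}J_\nu(\mu x)\right) = \mu x^{\nu}J_{\nu-1}(\mu x), \qquad \frac{d}{dx}\left(x^{\nu}Y_\nu(\mu x)\right) = \mu x^{\nu}Y_{\nu-1}(\mu x). \tag{5.204}$$

The Bessel functions $J_0(\mu_0 x)$, $J_0(\mu_1 x)$, $J_0(\mu_2 x)$, $J_0(\mu_3 x)$ are plotted in Fig. 5.6. Here the
eigenvalues $\mu_n$ can be determined from trial and error. The first four are found to be
$\mu_0 = 2.40483$, $\mu_1 = 5.52008$, $\mu_2 = 8.65373$, and $\mu_3 = 11.7915$. In general, one can say

$$\lim_{n\to\infty}\mu_n = n\pi + O(1). \tag{5.205}$$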
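
The "trial and error" determination of the eigenvalues $\mu_n$ can be mimicked by summing the standard power series for $J_\nu$ and bracketing its sign changes. The sketch below is illustrative: the function names, the truncation at 60 terms, the scan step, and the upper limit `x_max` are assumptions, and the truncated series is only trusted for moderate arguments such as those used here.

```python
import math

def bessel_J(nu, x, terms=60):
    """Truncated power series for J_nu(x): (x/2)^nu sum_k (-x^2/4)^k / (k! Gamma(nu+1+k))."""
    half = x / 2.0
    total = 0.0
    for k in range(terms):
        total += (-half * half) ** k / (math.factorial(k) * math.gamma(nu + 1 + k))
    return half ** nu * total

def zeros_of_J(nu, count, step=0.05, x_max=15.0):
    """Locate the first `count` positive zeros of J_nu by scanning and bisecting."""
    roots, x = [], step
    while len(roots) < count and x < x_max:
        lo, hi = x, x + step
        if bessel_J(nu, lo) * bessel_J(nu, hi) < 0:
            for _ in range(60):  # bisection refinement of the bracket
                mid = 0.5 * (lo + hi)
                if bessel_J(nu, lo) * bessel_J(nu, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        x += step
    return roots

mu = zeros_of_J(0, 4)
print(mu)  # compare with the four values of mu_n quoted above
print([mu[i + 1] - mu[i] for i in range(3)], math.pi)
```

The spacing between successive zeros approaches $\pi$, consistent with Eq. (5.205).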

The Bessel functions $J_0(x)$, $J_1(x)$, $J_2(x)$, $J_3(x)$, and $J_4(x)$ along with the Neumann functions
$Y_0(x)$, $Y_1(x)$, $Y_2(x)$, $Y_3(x)$, and $Y_4(x)$ are plotted in Fig. 5.7 (so here $\mu = 1$).

The orthogonality condition for a domain $x \in (0, 1)$, taken here for the case in which the
eigenvalue is $\mu_n$, can be shown to be

$$\int_0^1 x\,J_\nu(\mu_n x)J_\nu(\mu_m x)\, dx = \frac{1}{2}\left(J_{\nu+1}(\mu_n)\right)^2\delta_{nm}. \tag{5.206}$$

Here we must choose $\mu_n$ such that $J_\nu(\mu_n) = 0$, which corresponds to a vanishing of the
function at the outer limit $x = 1$; see Hildebrand, p. 226. So the orthonormal Bessel
function is

$$\varphi_n(x) = \frac{\sqrt{2x}\,J_\nu(\mu_n x)}{\left|J_{\nu+1}(\mu_n)\right|}. \tag{5.207}$$

Figure 5.6: Bessel functions $J_0(\mu_0 x)$, $J_0(\mu_1 x)$, $J_0(\mu_2 x)$, $J_0(\mu_3 x)$.

Figure 5.7: Bessel functions $J_0(x)$, $J_1(x)$, $J_2(x)$, $J_3(x)$, $J_4(x)$ and Neumann functions $Y_0(x)$,
$Y_1(x)$, $Y_2(x)$, $Y_3(x)$, $Y_4(x)$.
5.1.6.2 Third kind

Hankel functions, also known as Bessel functions of the third kind, are defined by

$$H_\nu^{(1)}(x) = J_\nu(x) + iY_\nu(x), \tag{5.208}$$

$$H_\nu^{(2)}(x) = J_\nu(x) - iY_\nu(x). \tag{5.209}$$

5.1.6.3 Modified Bessel functions

The modified Bessel equation is

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} - (x^2 + \nu^2)\,y = 0, \tag{5.210}$$

the solutions of which are the modified Bessel functions. The modified Bessel function of the
first kind of order $\nu$ is

$$I_\nu(x) = i^{-\nu}J_\nu(ix). \tag{5.211}$$

The modified Bessel function of the second kind of order $\nu$ is

$$K_\nu(x) = \frac{\pi}{2}\,i^{\nu+1}H_\nu^{(1)}(ix). \tag{5.212}$$

5.1.6.4 Ber and bei functions

The real and imaginary parts of the solutions of

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} - (p^2 + ix^2)\,y = 0, \tag{5.213}$$

where $p$ is a real constant, are called the ber and bei functions.

5.2 Fourier series representation of arbitrary functions 

It is often useful, especially when solving partial differential equations, to be able to represent
an arbitrary function $f(x)$ in the domain $x \in [x_0, x_1]$ with an appropriately weighted sum of
orthonormal functions $\varphi_n(x)$:

$$f(x) = \sum_{n=0}^{\infty}\alpha_n\varphi_n(x). \tag{5.214}$$

We generally truncate the infinite series to a finite number of $N$ terms so that $f(x)$ is
approximated by

$$f(x) \approx \sum_{n=1}^{N}\alpha_n\varphi_n(x). \tag{5.215}$$

11 Hermann Hankel, 1839-1873, German mathematician.
We can better label an $N$-term approximation of a function as a projection of the function
from an infinite dimensional space onto an $N$-dimensional function space. This will be
discussed further in Sec. 7.3.2.6. The projection is useful only if the infinite series converges
so that the error incurred in neglecting terms past $N$ is small relative to the terms included.
The problem is to determine what the coefficients $\alpha_n$ must be. They can be found in the
following manner. We first assume the expansion exists and multiply both sides by $\varphi_k(x)$:

$$f(x)\varphi_k(x) = \sum_{n=0}^{\infty}\alpha_n\varphi_n(x)\varphi_k(x), \tag{5.216}$$

$$\int_{x_0}^{x_1} f(x)\varphi_k(x)\, dx = \int_{x_0}^{x_1}\sum_{n=0}^{\infty}\alpha_n\varphi_n(x)\varphi_k(x)\, dx, \tag{5.217}$$

$$= \sum_{n=0}^{\infty}\alpha_n\underbrace{\int_{x_0}^{x_1}\varphi_n(x)\varphi_k(x)\, dx}_{\delta_{nk}}, \tag{5.218}$$

$$= \sum_{n=0}^{\infty}\alpha_n\delta_{nk}, \tag{5.219}$$

$$= \alpha_0\underbrace{\delta_{0k}}_{=0} + \alpha_1\underbrace{\delta_{1k}}_{=0} + \ldots + \alpha_k\underbrace{\delta_{kk}}_{=1} + \ldots + \alpha_\infty\underbrace{\delta_{\infty k}}_{=0}, \tag{5.220}$$

$$= \alpha_k. \tag{5.221}$$

So trading $k$ and $n$

$$\alpha_n = \int_{x_0}^{x_1} f(x)\varphi_n(x)\, dx. \tag{5.222}$$

The series is known as a Fourier series. Depending on the expansion functions, the series is
often specialized as Fourier-sine, Fourier-cosine, Fourier-Legendre, Fourier-Bessel, etc. We
have inverted Eq. (5.214) to solve for the unknown $\alpha_n$. The inversion was aided greatly
by the fact that the basis functions were orthonormal. For non-orthonormal, as well as
non-orthogonal bases, more general techniques exist for the determination of $\alpha_n$.



Example 5.2

Represent

$$f(x) = x^2, \quad \text{on } x \in [0, 3], \tag{5.223}$$

with a series of 

• trigonometric functions, 

• Legendre polynomials, 

• Chebyshev polynomials, and 

• Bessel functions. 

CC BY-NC-ND. 29 July 2012, Sen & Powers.



Trigonometric Series

For the trigonometric series, let's try a Fourier sine series. The orthonormal functions in this case
are, from Eq. (5.54),

$$\varphi_n(x) = \sqrt{\frac{2}{3}}\sin\left(\frac{n\pi x}{3}\right). \tag{5.224}$$

The coefficients from Eq. (5.222) are thus

$$\alpha_n = \int_0^3 \underbrace{x^2}_{f(x)}\underbrace{\left(\sqrt{\frac{2}{3}}\sin\left(\frac{n\pi x}{3}\right)\right)}_{\varphi_n(x)} dx, \tag{5.225}$$

so

$$\alpha_0 = 0, \tag{5.226}$$

$$\alpha_1 = 4.17328, \tag{5.227}$$

$$\alpha_2 = -3.50864, \tag{5.228}$$

$$\alpha_3 = 2.23376, \tag{5.229}$$

$$\alpha_4 = -1.75432, \tag{5.230}$$

$$\alpha_5 = 1.3807. \tag{5.231}$$

Note that the magnitude of the coefficient on the orthonormal function, $\alpha_n$, decreases as $n$ increases.
From this, one can loosely infer that the higher frequency modes contain less "energy."

$$f(x) = \sqrt{\frac{2}{3}}\left(4.17328\sin\left(\frac{\pi x}{3}\right) - 3.50864\sin\left(\frac{2\pi x}{3}\right)\right. \tag{5.232}$$

$$\qquad \left. +\, 2.23376\sin\left(\frac{3\pi x}{3}\right) - 1.75432\sin\left(\frac{4\pi x}{3}\right) + 1.3807\sin\left(\frac{5\pi x}{3}\right) + \ldots\right). \tag{5.233}$$

The function $f(x) = x^2$ and five terms of the approximation are plotted in Fig. 5.8.
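
The sine-series coefficients can be reproduced by applying a quadrature rule to Eq. (5.225). The following is a minimal sketch using a composite Simpson rule; the names `simpson`, `phi`, `alpha`, and `partial_sum` are chosen here for illustration.

```python
import math

ELL = 3.0  # domain length for this example

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(n, x):
    """Orthonormal sine function of Eq. (5.224)."""
    return math.sqrt(2.0 / ELL) * math.sin(n * math.pi * x / ELL)

def alpha(n):
    """Coefficient of Eq. (5.225) for f(x) = x^2."""
    return simpson(lambda x: x * x * phi(n, x), 0.0, ELL)

def partial_sum(x, n_terms=5):
    """Truncated Fourier-sine approximation of f(x) = x^2."""
    return sum(alpha(n) * phi(n, x) for n in range(1, n_terms + 1))

print([alpha(n) for n in range(1, 6)])  # compare with Eqs. (5.227)-(5.231)
print(partial_sum(1.5), 1.5 ** 2)
```

Every sine function vanishes at $x = 3$ while $f(3) = 9$, so the truncated series cannot match $f$ near that endpoint regardless of how many terms are retained.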

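Legendre polynomials

Next, let's try the Legendre polynomials. The Legendre polynomials are orthogonal on $\tilde{x} \in [-1, 1]$,
and we have $x \in [0, 3]$, so let's define

$$\tilde{x} = \frac{2}{3}x - 1, \tag{5.234}$$

$$x = \frac{3}{2}(\tilde{x} + 1), \tag{5.235}$$

so that the domain $x \in [0, 3]$ maps into $\tilde{x} \in [-1, 1]$. So, expanding $x^2$ on the domain $x \in [0, 3]$ is
equivalent to expanding

$$\left(\frac{3}{2}\right)^2(\tilde{x} + 1)^2 = \frac{9}{4}(\tilde{x} + 1)^2, \quad \tilde{x} \in [-1, 1]. \tag{5.236}$$

Now from Eq. (5.84),

$$\varphi_n(\tilde{x}) = \sqrt{n + \frac{1}{2}}\,P_n(\tilde{x}). \tag{5.237}$$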




Figure 5.8: Five term Fourier-sine series approximation to f(x) = x^2.



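So from Eq. (5.222)

\alpha_n = \int_{-1}^{1} \underbrace{\frac{9}{4} (\tilde{x} + 1)^2}_{f(\tilde{x})} \underbrace{\sqrt{n + \frac{1}{2}} P_n(\tilde{x})}_{\varphi_n(\tilde{x})} d\tilde{x}. (5.238)

Evaluating, we get

\alpha_0 = 3 \sqrt{2} = 4.24264, (5.239)
\alpha_1 = 3 \sqrt{\frac{3}{2}} = 3.67423, (5.240)
\alpha_2 = \frac{3}{\sqrt{10}} = 0.948683, (5.241)
\alpha_3 = 0, (5.242)
\vdots (5.243)
\alpha_n = 0, n > 3. (5.244)

Once again, the fact that \alpha_0 > \alpha_1 > \alpha_2 indicates the bulk of the "energy" is contained in the lower
frequency modes. Carrying out the multiplication and returning to x space gives the finite series, which
can be expressed in a variety of forms:

x^2 = \alpha_0 \varphi_0(\tilde{x}) + \alpha_1 \varphi_1(\tilde{x}) + \alpha_2 \varphi_2(\tilde{x}), (5.245)
= 3 \sqrt{2} \underbrace{\sqrt{\frac{1}{2}} P_0\left(\frac{2}{3} x - 1\right)}_{=\varphi_0(\tilde{x})} + 3 \sqrt{\frac{3}{2}} \underbrace{\sqrt{\frac{3}{2}} P_1\left(\frac{2}{3} x - 1\right)}_{=\varphi_1(\tilde{x})} + \frac{3}{\sqrt{10}} \underbrace{\sqrt{\frac{5}{2}} P_2\left(\frac{2}{3} x - 1\right)}_{=\varphi_2(\tilde{x})}, (5.246)
= 3 P_0\left(\frac{2}{3} x - 1\right) + \frac{9}{2} P_1\left(\frac{2}{3} x - 1\right) + \frac{3}{2} P_2\left(\frac{2}{3} x - 1\right), (5.247)
= 3(1) + \frac{9}{2} \left(\frac{2}{3} x - 1\right) + \frac{3}{2} \left( -\frac{1}{2} + \frac{3}{2} \left(\frac{2}{3} x - 1\right)^2 \right), (5.248)
= 3 + \left( -\frac{9}{2} + 3x \right) + \left( \frac{3}{2} - 3x + x^2 \right), (5.249)
= x^2. (5.250)

Thus, the Fourier-Legendre representation is exact over the entire domain. This is because the function
which is being expanded has the same general functional form as the Legendre polynomials; both are
polynomials.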
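
As a numerical cross-check of Eqs. (5.238)-(5.250), the sketch below evaluates the Fourier-Legendre coefficients and then sums the series. It assumes the Legendre polynomials are generated by the standard three-term recurrence and integrated with Simpson's rule; the function names are illustrative only.

```python
import math

def legendre(n, x):
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def legendre_coefficient(n):
    """alpha_n of Eq. (5.238) on the transformed domain [-1, 1]."""
    def integrand(xt):
        return 2.25 * (xt + 1.0) ** 2 * math.sqrt(n + 0.5) * legendre(n, xt)
    return simpson(integrand, -1.0, 1.0)

alphas = [legendre_coefficient(n) for n in range(5)]

def series(x):
    """Sum of alpha_n phi_n evaluated at the original coordinate x in [0, 3]."""
    xt = 2.0 * x / 3.0 - 1.0
    return sum(a * math.sqrt(n + 0.5) * legendre(n, xt)
               for n, a in enumerate(alphas))

print([round(a, 6) for a in alphas])
print(series(2.0))  # should be close to 2^2
```

Only the first three coefficients are non-zero to within round-off, which is the numerical counterpart of the exactness noted above.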

Chebyshev polynomials 

Let's now try the Chebyshev polynomials. These are orthogonal on the same domain as the Legendre
polynomials, so let's use the same transformation as before. Now from Eq. (5.113)

\varphi_0(\tilde{x}) = \sqrt{\frac{1}{\pi \sqrt{1 - \tilde{x}^2}}} T_0(\tilde{x}), (5.251)

\varphi_n(\tilde{x}) = \sqrt{\frac{2}{\pi \sqrt{1 - \tilde{x}^2}}} T_n(\tilde{x}), n > 0. (5.252)

So

\alpha_0 = \int_{-1}^{1} \underbrace{\frac{9}{4} (\tilde{x} + 1)^2}_{f(\tilde{x})} \underbrace{\sqrt{\frac{1}{\pi \sqrt{1 - \tilde{x}^2}}} T_0(\tilde{x})}_{\varphi_0(\tilde{x})} d\tilde{x}, (5.253)

\alpha_n = \int_{-1}^{1} \underbrace{\frac{9}{4} (\tilde{x} + 1)^2}_{f(\tilde{x})} \underbrace{\sqrt{\frac{2}{\pi \sqrt{1 - \tilde{x}^2}}} T_n(\tilde{x})}_{\varphi_n(\tilde{x})} d\tilde{x}. (5.254)

Evaluating, we get

\alpha_0 = 4.2587, (5.255)
\alpha_1 = 3.4415, (5.256)
\alpha_2 = -0.28679, (5.257)
\alpha_3 = -1.1472. (5.258)

With this representation, we see that |\alpha_3| > |\alpha_2|, so it is not yet clear that the "energy" is concentrated
in the low frequency modes. Consideration of more terms would verify that in fact it is the case that
the "energy" of high frequency modes is decaying; in fact \alpha_4 = -0.683, \alpha_5 = -0.441, \alpha_6 = -0.328,
\alpha_7 = -0.254. So

f(x) = x^2 = \sqrt{\frac{2}{\pi \sqrt{1 - \left(\frac{2}{3} x - 1\right)^2}}} \left( \frac{4.2587}{\sqrt{2}} T_0\left(\frac{2}{3} x - 1\right) + 3.4415 T_1\left(\frac{2}{3} x - 1\right) \right. (5.259)
\left. - 0.28679 T_2\left(\frac{2}{3} x - 1\right) - 1.1472 T_3\left(\frac{2}{3} x - 1\right) + \ldots \right). (5.260)

The function f(x) = x^2 and four terms of the approximation are plotted in Fig. 5.9.

Figure 5.9: Four term Fourier-Chebyshev series approximation to f(x) = x^2.
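
The Chebyshev coefficients of Eqs. (5.253)-(5.254) involve an integrable endpoint singularity. A minimal sketch of one way to evaluate them follows; it assumes the substitution \tilde{x} = \cos\theta, under which T_n(\tilde{x}) = \cos(n\theta) and the weight reduces to \sqrt{\sin\theta}. This substitution and the helper names are choices made for this illustration, not a method stated in the notes.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def chebyshev_coefficient(n):
    """alpha_n of Eqs. (5.253)-(5.254) after substituting x = cos(theta).

    dx / (1 - x^2)^(1/4) becomes sqrt(sin(theta)) d(theta) on [0, pi].
    """
    c = 1.0 if n == 0 else 2.0
    def integrand(theta):
        s = max(math.sin(theta), 0.0)  # guard against round-off near pi
        f = 2.25 * (math.cos(theta) + 1.0) ** 2
        return f * math.sqrt(c / math.pi) * math.cos(n * theta) * math.sqrt(s)
    return simpson(integrand, 0.0, math.pi)

alphas = [chebyshev_coefficient(n) for n in range(4)]
for n, a in enumerate(alphas):
    print(f"alpha_{n} = {a:.4f}")
```

Because \sqrt{\sin\theta} still has a square-root behavior at \theta = 0, Simpson's rule converges more slowly here than for smooth integrands, which is why a finer grid is used.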



Bessel functions 

Now let's expand in terms of Bessel functions. The Bessel functions have been defined such that
they are orthogonal on a domain between zero and unity when the eigenvalues are the zeros of the
Bessel function. To achieve this we adopt the transformation (and inverse):

\tilde{x} = \frac{x}{3}, x = 3 \tilde{x}. (5.261)

With this transformation our domain transforms as follows:

x \in [0,3] \to \tilde{x} \in [0,1]. (5.262)

So in the transformed space, we seek an expansion

\underbrace{9 \tilde{x}^2}_{f(\tilde{x})} = \sum_{n=0}^{\infty} \alpha_n J_\nu(\mu_n \tilde{x}). (5.263)

Let's choose to expand on J_0, so we take

9 \tilde{x}^2 = \sum_{n=0}^{\infty} \alpha_n J_0(\mu_n \tilde{x}). (5.264)

Now, the eigenvalues \mu_n are such that J_0(\mu_n) = 0. We find using trial and error methods that solutions
for all the zeros can be found:

\mu_0 = 2.4048, (5.265)
\mu_1 = 5.5201, (5.266)
\mu_2 = 8.6537, (5.267)
\vdots

Figure 5.10: Ten term Fourier-Bessel series approximation to f(x) = x^2.



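Similar to the other functions, we could expand in terms of the orthonormalized Bessel functions, \varphi_n(\tilde{x}).
Instead, for variety, let's directly operate on Eq. (5.264) to determine the values for \alpha_n.

9 \tilde{x}^2 \tilde{x} J_0(\mu_k \tilde{x}) = \sum_{n=0}^{\infty} \alpha_n \tilde{x} J_0(\mu_n \tilde{x}) J_0(\mu_k \tilde{x}), (5.268)

\int_0^1 9 \tilde{x}^3 J_0(\mu_k \tilde{x}) d\tilde{x} = \int_0^1 \sum_{n=0}^{\infty} \alpha_n \tilde{x} J_0(\mu_n \tilde{x}) J_0(\mu_k \tilde{x}) d\tilde{x}, (5.269)

9 \int_0^1 \tilde{x}^3 J_0(\mu_k \tilde{x}) d\tilde{x} = \sum_{n=0}^{\infty} \alpha_n \int_0^1 \tilde{x} J_0(\mu_n \tilde{x}) J_0(\mu_k \tilde{x}) d\tilde{x}, (5.270)

= \alpha_k \int_0^1 \tilde{x} J_0(\mu_k \tilde{x}) J_0(\mu_k \tilde{x}) d\tilde{x}. (5.271)

So replacing k by n and dividing we get

\alpha_n = \frac{9 \int_0^1 \tilde{x}^3 J_0(\mu_n \tilde{x}) d\tilde{x}}{\int_0^1 \tilde{x} J_0(\mu_n \tilde{x}) J_0(\mu_n \tilde{x}) d\tilde{x}}. (5.272)

Evaluating the first three terms we get

\alpha_0 = 4.446, (5.273)
\alpha_1 = -8.325, (5.274)
\alpha_2 = 7.253. (5.275)

Because the basis functions are not normalized, it is difficult to infer how the amplitude is decaying by
looking at \alpha_n alone. The function f(x) = x^2 and ten terms of the Fourier-Bessel series approximation
are plotted in Fig. 5.10. The Fourier-Bessel approximation is

f(x) = x^2 = 4.446 J_0\left(2.4048 \left(\frac{x}{3}\right)\right) - 8.325 J_0\left(5.5201 \left(\frac{x}{3}\right)\right) + 7.253 J_0\left(8.6537 \left(\frac{x}{3}\right)\right) + \ldots. (5.276)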
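
The zeros \mu_n and the ratio of integrals in Eq. (5.272) can be reproduced with a short script. This is a sketch under stated assumptions: J_0 is evaluated from its power series (adequate for the moderate arguments used here), the zeros are located by bisection on hand-picked brackets as a stand-in for the "trial and error" mentioned above, and the integrals use Simpson's rule. All function names are illustrative.

```python
import math

def j0(x, terms=60):
    """Bessel function J_0(x) from its power series, for moderate |x|."""
    term, total = 1.0, 1.0
    q = -(x * x) / 4.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign on the bracket."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

# Brackets chosen by inspection of the sign of J_0 (an assumption of this sketch).
mus = [bisect(j0, lo, hi) for lo, hi in [(2.0, 3.0), (5.0, 6.0), (8.0, 9.0)]]

def bessel_coefficient(mu):
    """alpha_n of Eq. (5.272) for the eigenvalue mu."""
    num = 9.0 * simpson(lambda x: x ** 3 * j0(mu * x), 0.0, 1.0)
    den = simpson(lambda x: x * j0(mu * x) ** 2, 0.0, 1.0)
    return num / den

alphas = [bessel_coefficient(mu) for mu in mus]
print([round(m, 4) for m in mus])
print([round(a, 3) for a in alphas])
```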






Note that other Fourier-Bessel expansions exist. Also note that even though the Bessel function does
not match the function itself at either boundary point, the series still appears to be converging.

I 



Problems 

1. Show that oscillatory solutions of the delay equation

\frac{dx}{dt}(t) + x(t) + b x(t - 1) = 0,

are possible only when b = 2.2617. Find the frequency.

2. Show that x^a J_\nu(b x^c) is a solution of

y'' - \frac{2a - 1}{x} y' + \left( b^2 c^2 x^{2c - 2} + \frac{a^2 - \nu^2 c^2}{x^2} \right) y = 0.

Hence solve in terms of Bessel functions:

(a) \frac{d^2 y}{dx^2} + k^2 x y = 0,

(b) \frac{d^2 y}{dx^2} + x^4 y = 0.

3. Laguerre's differential equation is

x y'' + (1 - x) y' + \lambda y = 0.

Show that when \lambda = n, a nonnegative integer, there is a polynomial solution L_n(x) (called a Laguerre
polynomial) of degree n with coefficient of x^n equal to 1. Determine L_0 through L_4.

4. Consider the function y(x) = x^2 - 2x + 1 defined for x \in [0,4]. Find eight term expansions in terms
of a) Fourier-sine, b) Fourier-Legendre, c) Fourier-Hermite (physicists'), d) Fourier-Bessel series and
plot your results on a single graph.

5. Consider the function y(x) = 0, x \in [0,1), y(x) = 2x - 2, x \in [1,2]. Find an eight term Fourier-Legendre
expansion of this function. Plot the function and the eight term expansion for x \in [0,2].

6. Consider the function y(x) = 2x, x \in [0,6]. Find an eight term a) Fourier-Chebyshev and b) Fourier-sine
expansion of this function. Plot the function and the eight term expansions for x \in [0,6]. Which
expansion minimizes the error in representation of the function?

7. Consider the function y(x) = \cos^2(x^2). Find an eight term a) Fourier-Laguerre, (x \in [0,\infty)), and b)
Fourier-sine (x \in [0,10]) expansion of this function. Plot the function and the eight term expansions
for x \in [0,10]. Which expansion minimizes the error in representation of the function?
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Chapter 6 

Vectors and tensors 



see Kaplan, Chapters 3, 4, 5, 

see Lopez, Chapters 17-23, 

see Aris,

see Borisenko and Tarapov, 

see McConnell, 

see Schey, 

see Riley, Hobson, and Bence, Chapters 6, 8, 19. 

This chapter will outline many topics considered in traditional vector calculus and include 
an introduction to differential geometry. 

6.1 Cartesian index notation 

Here we will consider what is known as Cartesian index notation as a way to represent vectors
and tensors. In contrast to Sec. 1.3, which considered general coordinate transformations,
when we restrict our transformations to rotations about the origin, many simplifications
result. For such transformations, the distinction between contravariance and covariance
disappears, as does the necessity for Christoffel symbols, and also the need for an
"upstairs-downstairs" index notation.

Many vector relations can be written in a compact form by using Cartesian index notation.
Let x_1, x_2, x_3 represent the three coordinate directions and e_1, e_2, e_3 the unit vectors
in those directions. Then a vector u may be written as

u = u_1 e_1 + u_2 e_2 + u_3 e_3 = \sum_{i=1}^{3} u_i e_i = u_i e_i = u_i, (6.1)

where u_1, u_2, and u_3 are the three Cartesian components of u. Note that we do not need to
use the summation sign every time if we use the Einstein convention to sum from 1 to 3 if
an index is repeated. The single free index on the right side of Eq. (6.1) indicates that an e_i
is assumed.
Two additional symbols are needed for later use. They are the Kronecker delta, as
specialized from Eq. (1.63),

\delta_{ij} \equiv \begin{cases} 0, & \text{if } i \ne j, \\ 1, & \text{if } i = j, \end{cases} (6.2)

and the alternating symbol (or Levi-Civita^1 symbol)

\epsilon_{ijk} \equiv \begin{cases} 1, & \text{if indices are in cyclical order } 1,2,3,1,2,\ldots, \\ -1, & \text{if indices are not in cyclical order}, \\ 0, & \text{if two or more indices are the same}. \end{cases} (6.3)

The identity

\epsilon_{ijk} \epsilon_{lmn} = \delta_{il} \delta_{jm} \delta_{kn} + \delta_{im} \delta_{jn} \delta_{kl} + \delta_{in} \delta_{jl} \delta_{km} - \delta_{il} \delta_{jn} \delta_{km} - \delta_{im} \delta_{jl} \delta_{kn} - \delta_{in} \delta_{jm} \delta_{kl}, (6.4)

relates the two. The following identities are also easily shown:

\delta_{ii} = 3, (6.5)
\delta_{ij} = \delta_{ji}, (6.6)
\delta_{ij} \delta_{jk} = \delta_{ik}, (6.7)
\epsilon_{ijk} \epsilon_{ilm} = \delta_{jl} \delta_{km} - \delta_{jm} \delta_{kl}, (6.8)
\epsilon_{ijk} \epsilon_{ljk} = 2 \delta_{il}, (6.9)
\epsilon_{ijk} \epsilon_{ijk} = 6, (6.10)
\epsilon_{ijk} = -\epsilon_{ikj}, (6.11)
\epsilon_{ijk} = -\epsilon_{jik}, (6.12)
\epsilon_{ijk} = -\epsilon_{kji}, (6.13)
\epsilon_{ijk} = \epsilon_{kij} = \epsilon_{jki}. (6.14)

^1 Tullio Levi-Civita, 1873-1941, Italian mathematician.

Regarding index notation:

• a repeated index indicates summation on that index,

• a non-repeated index is known as a free index,

• the number of free indices gives the order of the tensor:

  - u, uv, u_i v_i w, u_{ii}, u_{ij} v_{ij}, zeroth order tensor (scalar),

  - u_i, u_i v_{ij}, first order tensor (vector),

  - u_{ij}, u_{ij} v_{jk}, u_i v_j, second order tensor,

  - u_{ijk}, u_i v_j w_k, u_{ij} v_{km} w_m, third order tensor,

  - u_{ijkl}, u_{ij} v_{kl}, fourth order tensor,

• indices cannot be repeated more than once:

  - u_{iik}, u_{ij}, u_{iijj}, v_i u_{jk} are proper,

  - u_i v_i w_i, u_{iiij}, u_{ij} v_{ii} are improper!

• Cartesian components commute: u_{ij} v_i w_{klm} = v_i w_{klm} u_{ij},

• Cartesian indices do not commute: u_{ijkl} \ne u_{jlik}.
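
The identities relating the Kronecker delta and the alternating symbol lend themselves to a brute-force check over all index values. The sketch below assumes indices run over 1, 2, 3 and uses a compact closed form for the alternating symbol; the function names are illustrative.

```python
from itertools import product

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def epsilon(i, j, k):
    """Alternating symbol for indices in {1, 2, 3}.

    The product (i-j)(j-k)(k-i) is +2 for cyclic order, -2 for
    anti-cyclic order, and 0 when any two indices coincide.
    """
    return (i - j) * (j - k) * (k - i) // 2

R = (1, 2, 3)

# delta_ii = 3 and epsilon_ijk epsilon_ijk = 6 (summation on repeated indices).
delta_ii = sum(delta(i, i) for i in R)
eps_eps = sum(epsilon(i, j, k) ** 2 for i, j, k in product(R, repeat=3))

# epsilon_ijk epsilon_ilm = delta_jl delta_km - delta_jm delta_kl.
contraction_ok = all(
    sum(epsilon(i, j, k) * epsilon(i, l, m) for i in R)
    == delta(j, l) * delta(k, m) - delta(j, m) * delta(k, l)
    for j, k, l, m in product(R, repeat=4)
)

# epsilon_ijk epsilon_ljk = 2 delta_il.
double_contraction_ok = all(
    sum(epsilon(i, j, k) * epsilon(l, j, k) for j, k in product(R, repeat=2))
    == 2 * delta(i, l)
    for i, l in product(R, repeat=2)
)

print(delta_ii, eps_eps, contraction_ok, double_contraction_ok)
```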



I 

Example 6.1 

Let us consider, using generalized coordinates described earlier in Sec. 1.3, a trivial identity
transformation from the Cartesian \xi^i coordinates to the transformed coordinates x^i:

x^1 = \xi^1, x^2 = \xi^2, x^3 = \xi^3. (6.15)

Here, we are returning to the more general "upstairs-downstairs" index notation of Sec. 1.3. Recalling
Eq. (1.78), the Jacobian of the transformation is

J = \frac{\partial \xi^i}{\partial x^j} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \delta^i_j = I. (6.16)

From Eq. (1.85), the metric tensor then is

g_{ij} = G = J^T \cdot J = I \cdot I = I = \delta_{ij}. (6.17)

Then we find by the transformation rules that for this transformation, the covariant and contravariant
representations of a general vector u are one and the same:

u_i = g_{ij} u^j = \delta_{ij} u^j = \delta^i_j u^j = u^i. (6.18)

Consequently, for Cartesian vectors, there is no need to use a notation which distinguishes covariant
and contravariant representations. We will hereafter write all Cartesian vectors with only a subscript
notation.



6.2 Cartesian tensors 



6.2.1 Direction cosines 



Consider the alias transformation of the (x_1, x_2) Cartesian coordinate system by rotation of
each coordinate axis by angle \alpha to the rotated Cartesian coordinate system \overline{x}_1, \overline{x}_2 as sketched
in Fig. 6.1. Relative to our earlier notation for general non-Cartesian systems, Sec. 1.3, in
this chapter, x plays the role of the earlier \xi, and \overline{x} plays the role of the earlier x.

Figure 6.1: Rotation of axes in a two-dimensional Cartesian system.

We define the angle between the x_1 and \overline{x}_1 axes as \alpha:

\alpha \equiv [x_1, \overline{x}_1]. (6.19)

With \beta = \pi/2 - \alpha, the angle between the \overline{x}_1 and x_2 axes is

\beta \equiv [x_2, \overline{x}_1]. (6.20)

The point P can be represented in both coordinate systems. In the unrotated system, P is
represented by the coordinates:

P : (x_1^*, x_2^*). (6.21)

In the rotated coordinate system, P is represented by

P : (\overline{x}_1^*, \overline{x}_2^*). (6.22)

Trigonometry shows us that

\overline{x}_1^* = x_1^* \cos \alpha + x_2^* \cos \beta, (6.23)
\overline{x}_1^* = x_1^* \cos[x_1, \overline{x}_1] + x_2^* \cos[x_2, \overline{x}_1]. (6.24)

Dropping the stars, and extending to three dimensions, we find that

\overline{x}_1 = x_1 \cos[x_1, \overline{x}_1] + x_2 \cos[x_2, \overline{x}_1] + x_3 \cos[x_3, \overline{x}_1]. (6.25)

Extending to expressions for \overline{x}_2 and \overline{x}_3 and writing in matrix form, we get

\underbrace{(\overline{x}_1 \; \overline{x}_2 \; \overline{x}_3)}_{=\overline{x}_j = \overline{x}^T} = \underbrace{(x_1 \; x_2 \; x_3)}_{=x_i = x^T} \cdot \underbrace{\begin{pmatrix} \cos[x_1, \overline{x}_1] & \cos[x_1, \overline{x}_2] & \cos[x_1, \overline{x}_3] \\ \cos[x_2, \overline{x}_1] & \cos[x_2, \overline{x}_2] & \cos[x_2, \overline{x}_3] \\ \cos[x_3, \overline{x}_1] & \cos[x_3, \overline{x}_2] & \cos[x_3, \overline{x}_3] \end{pmatrix}}_{=\ell_{ij} = Q}. (6.26)

Using the notation

\ell_{ij} = \cos[x_i, \overline{x}_j], (6.27)

Eq. (6.26) is written as

(\overline{x}_1 \; \overline{x}_2 \; \overline{x}_3) = (x_1 \; x_2 \; x_3) \cdot \underbrace{\begin{pmatrix} \ell_{11} & \ell_{12} & \ell_{13} \\ \ell_{21} & \ell_{22} & \ell_{23} \\ \ell_{31} & \ell_{32} & \ell_{33} \end{pmatrix}}_{=Q}. (6.28)

Here \ell_{ij} are known as the direction cosines. Expanding the first term we find

\overline{x}_1 = x_1 \ell_{11} + x_2 \ell_{21} + x_3 \ell_{31}. (6.29)

More generally, we have

\overline{x}_j = x_1 \ell_{1j} + x_2 \ell_{2j} + x_3 \ell_{3j}, (6.30)
= \sum_{i=1}^{3} x_i \ell_{ij}, (6.31)
= x_i \ell_{ij}. (6.32)

Here we have employed Einstein's convention that a repeated index implies a summation over
that index.

What amounts to the law of cosines,

\ell_{ij} \ell_{kj} = \delta_{ik}, (6.33)

can easily be proven by direct substitution. Direction cosine matrices applied to geometric
entities such as polygons have the property of being volume- and orientation-preserving
because \det \ell_{ij} = 1. General volume-preserving transformations have determinant of \pm 1.
For right-handed coordinate systems, transformations which have positive determinants are
orientation-preserving, and those which have negative determinants are orientation-reversing.
Transformations which are volume-preserving but orientation-reversing have determinant of
-1, and involve a reflection.



I 

Example 6.2 

Show for the two-dimensional system described in Fig. 6.1 that \ell_{ij} \ell_{kj} = \delta_{ik} holds.

Expanding for the two-dimensional system, we get

\ell_{i1} \ell_{k1} + \ell_{i2} \ell_{k2} = \delta_{ik}. (6.34)

First, take i = 1, k = 1. We get then

\ell_{11} \ell_{11} + \ell_{12} \ell_{12} = \delta_{11} = 1, (6.35)
\cos \alpha \cos \alpha + \cos(\alpha + \pi/2) \cos(\alpha + \pi/2) = 1, (6.36)
\cos \alpha \cos \alpha + (-\sin \alpha)(-\sin \alpha) = 1, (6.37)
\cos^2 \alpha + \sin^2 \alpha = 1. (6.38)

This is obviously true. Next, take i = 1, k = 2. We get then

\ell_{11} \ell_{21} + \ell_{12} \ell_{22} = \delta_{12} = 0, (6.39)
\cos \alpha \cos(\pi/2 - \alpha) + \cos(\alpha + \pi/2) \cos \alpha = 0, (6.40)
\cos \alpha \sin \alpha - \sin \alpha \cos \alpha = 0. (6.41)

This is obviously true. Next, take i = 2, k = 1. We get then

\ell_{21} \ell_{11} + \ell_{22} \ell_{12} = \delta_{21} = 0, (6.42)
\cos(\pi/2 - \alpha) \cos \alpha + \cos \alpha \cos(\pi/2 + \alpha) = 0, (6.43)
\sin \alpha \cos \alpha + \cos \alpha (-\sin \alpha) = 0. (6.44)

This is obviously true. Next, take i = 2, k = 2. We get then

\ell_{21} \ell_{21} + \ell_{22} \ell_{22} = \delta_{22} = 1, (6.45)
\cos(\pi/2 - \alpha) \cos(\pi/2 - \alpha) + \cos \alpha \cos \alpha = 1, (6.46)
\sin \alpha \sin \alpha + \cos \alpha \cos \alpha = 1. (6.47)

Again, this is obviously true.



J 



Using the law of cosines, Eq. (6.33), we can easily find the inverse transformation back
to the unprimed coordinates via the following operations. First operate on Eq. (6.32) with
\ell_{kj}:

\ell_{kj} \overline{x}_j = \ell_{kj} x_i \ell_{ij}, (6.48)
= \ell_{ij} \ell_{kj} x_i, (6.49)
= \delta_{ik} x_i, (6.50)
= x_k, (6.51)
\ell_{ij} \overline{x}_j = x_i, (6.52)
x_i = \ell_{ij} \overline{x}_j. (6.53)

Note that the Jacobian matrix of the transformation is J = \partial x_i / \partial \overline{x}_j = \ell_{ij}. It can be shown
that the metric tensor is G = J^T \cdot J = \ell_{ij} \ell_{ik} = \delta_{jk} = I, so g = 1, and the transformation
is volume-preserving. Moreover, since J^T \cdot J = I, we see that J^T = J^{-1}. As such, it is
precisely the type of matrix for which the gradient takes on the same form in original and
transformed coordinates, as presented in the discussion surrounding Eq. (1.95). As will be
discussed in detail in Sec. 8.6, matrices which have these properties are known as orthogonal
and are often denoted by Q. So for this class of transformations, J = Q = \partial x_i / \partial \overline{x}_j = \ell_{ij}. Note
that Q^T \cdot Q = I and that Q^T = Q^{-1}. The matrix Q is a rotation matrix when its elements
are composed of the direction cosines \ell_{ij}. Note then that Q^T = \ell_{ji}. For a coordinate system
which obeys the right-hand rule, we require \det Q = 1 so that it is also orientation-preserving.



I 

Example 6.3 

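Consider the previous two-dimensional example of a matrix which rotates a vector through an angle
\alpha using matrix methods.

We have

J = \frac{\partial x_i}{\partial \overline{x}_j} = \ell_{ij} = Q = \begin{pmatrix} \cos \alpha & \cos(\alpha + \frac{\pi}{2}) \\ \cos(\frac{\pi}{2} - \alpha) & \cos \alpha \end{pmatrix} = \begin{pmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{pmatrix}. (6.54)

We get the rotated coordinates via Eq. (6.26):

\overline{x}^T = x^T \cdot Q, (6.55)
(\overline{x}_1 \; \overline{x}_2) = (x_1 \; x_2) \cdot \begin{pmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{pmatrix}, (6.56)
= (x_1 \cos \alpha + x_2 \sin \alpha \quad -x_1 \sin \alpha + x_2 \cos \alpha), (6.57)
\begin{pmatrix} \overline{x}_1 \\ \overline{x}_2 \end{pmatrix} = \begin{pmatrix} x_1 \cos \alpha + x_2 \sin \alpha \\ -x_1 \sin \alpha + x_2 \cos \alpha \end{pmatrix}. (6.58)

We can also rearrange to say

\overline{x} = Q^T \cdot x, (6.59)
Q \cdot \overline{x} = \underbrace{Q \cdot Q^T}_{I} \cdot x, (6.60)
Q \cdot \overline{x} = I \cdot x, (6.61)
x = Q \cdot \overline{x}. (6.62)

The law of cosines holds because

Q \cdot Q^T = \begin{pmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{pmatrix} \cdot \begin{pmatrix} \cos \alpha & \sin \alpha \\ -\sin \alpha & \cos \alpha \end{pmatrix}, (6.63)
= \begin{pmatrix} \cos^2 \alpha + \sin^2 \alpha & 0 \\ 0 & \sin^2 \alpha + \cos^2 \alpha \end{pmatrix}, (6.64)
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, (6.65)
= I = \delta_{ij}. (6.66)

Consider the determinant of Q:

\det Q = \cos^2 \alpha - (-\sin^2 \alpha) = \cos^2 \alpha + \sin^2 \alpha = 1. (6.67)

Thus, the transformation is volume- and orientation-preserving; hence, it is a rotation. The rotation is
through an angle \alpha.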
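
The matrix manipulations of this example can be mirrored in a few lines of code. The sketch below assumes an arbitrary angle of 30 degrees and an arbitrary test vector, and uses the row-vector convention \overline{x}_j = x_i \ell_{ij} of Eq. (6.32); the helper names are illustrative.

```python
import math

def rotation_matrix(alpha):
    """Direction cosine matrix Q = l_ij of Eq. (6.54)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s], [s, c]]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

alpha = math.radians(30.0)   # arbitrary illustrative angle
Q = rotation_matrix(alpha)
x = [1.0, 2.0]               # arbitrary illustrative vector

# Rotated components, xbar_j = x_i l_ij.
xbar = [sum(x[i] * Q[i][j] for i in range(2)) for j in range(2)]

# Law of cosines, Q . Q^T = I, and the determinant.
QQT = matmul(Q, transpose(Q))
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]

# Inverse transformation, x_i = l_ij xbar_j.
x_back = [sum(Q[i][j] * xbar[j] for j in range(2)) for i in range(2)]

print(xbar, det_Q, x_back)
```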




I 

Example 6.4 

Consider the so-called reflection matrix in two dimensions:

Q = \begin{pmatrix} \cos \alpha & \sin \alpha \\ \sin \alpha & -\cos \alpha \end{pmatrix}. (6.68)

Note the reflection matrix is obtained by multiplying the second column of the rotation matrix of
Eq. (6.54) by -1. We see that

Q \cdot Q^T = \begin{pmatrix} \cos \alpha & \sin \alpha \\ \sin \alpha & -\cos \alpha \end{pmatrix} \cdot \begin{pmatrix} \cos \alpha & \sin \alpha \\ \sin \alpha & -\cos \alpha \end{pmatrix}, (6.69)
= \begin{pmatrix} \cos^2 \alpha + \sin^2 \alpha & 0 \\ 0 & \sin^2 \alpha + \cos^2 \alpha \end{pmatrix}, (6.70)
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I = \delta_{ij}. (6.71)

The determinant of the reflection matrix is

\det Q = -\cos^2 \alpha - \sin^2 \alpha = -1. (6.72)

Thus, the transformation is volume-preserving, but not orientation-preserving. One can show by
considering its action on vectors x that it reflects them about a line passing through the origin inclined
at an angle of \alpha/2 to the horizontal.
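
The claim about the line of reflection can be tested directly: a vector along the line inclined at \alpha/2 should be left unchanged, and a vector normal to that line should change sign. The sketch below assumes an arbitrary angle of 50 degrees; the names are illustrative.

```python
import math

def reflection_matrix(alpha):
    """Reflection matrix of Eq. (6.68)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s], [s, -c]]

def apply(q, v):
    """Matrix-vector product q . v for a 2 x 2 matrix."""
    return [q[0][0] * v[0] + q[0][1] * v[1],
            q[1][0] * v[0] + q[1][1] * v[1]]

alpha = math.radians(50.0)   # arbitrary illustrative angle
Q = reflection_matrix(alpha)
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]

# Unit vector along the line inclined at alpha/2, and its normal.
along = [math.cos(alpha / 2.0), math.sin(alpha / 2.0)]
normal = [-math.sin(alpha / 2.0), math.cos(alpha / 2.0)]

along_image = apply(Q, along)     # expected to equal `along`
normal_image = apply(Q, normal)   # expected to equal minus `normal`

print(det_Q, along_image, normal_image)
```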

I 



6.2.1.1 Scalars 

An entity \phi is a scalar if it is invariant under a rotation of coordinate axes.

6.2.1.2 Vectors

A set of three scalars (v_1, v_2, v_3)^T is defined as a vector if under a rotation of coordinate axes,
the triple also transforms according to

\overline{v}_j = v_i \ell_{ij}, \overline{v}^T = v^T \cdot Q. (6.73)

We could also transpose both sides and have

\overline{v} = Q^T \cdot v. (6.74)

A vector associates a scalar with a chosen direction in space by an expression which is linear 
in the direction cosines of the chosen direction. 
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I 

Example 6.5 

Returning to generalized coordinate notation, show the equivalence between covariant and
contravariant representations for pure rotations of a vector v.

Consider then a transformation from a Cartesian space \xi^j to a transformed space x^i via a pure
rotation:

\xi^i = \ell^i_j x^j. (6.75)

Here \ell^i_j is simply a matrix of direction cosines as we have previously defined; we employ the
upstairs-downstairs index notation for consistency. The Jacobian is

\frac{\partial \xi^i}{\partial x^j} = \ell^i_j. (6.76)

From Eq. (1.85), the metric tensor is

g_{kl} = \frac{\partial \xi^i}{\partial x^k} \frac{\partial \xi^i}{\partial x^l} = \ell^i_k \ell^i_l = \delta_{kl}. (6.77)

Here we have employed the law of cosines, which is easily extensible to the "upstairs-downstairs"
notation.

So a vector v has the same covariant and contravariant components since

v_i = g_{ij} v^j = \delta_{ij} v^j = \delta^i_j v^j = v^i. (6.78)

Note the vector itself has components that do transform under rotation:

v^i = \ell^i_j V^j. (6.79)

Here V^j is the contravariant representation of the vector v in the unrotated coordinate system. One
could also show that V_j = V^j, as always for a Cartesian system.

I 



6.2.1.3 Tensors 

A set of nine scalars is defined as a second order tensor if under a rotation of coordinate
axes, they transform as

\overline{T}_{ij} = \ell_{ki} \ell_{lj} T_{kl}, \overline{T} = Q^T \cdot T \cdot Q. (6.80)

A tensor associates a vector with each direction in space by an expression that is linear in
the direction cosines of the chosen transformation. It will be seen that

• the first subscript gives associated direction (or face; hence first-face), and

• the second subscript gives the vector components for that face.

Graphically, one can use the sketch in Fig. 6.2 to visualize a second order tensor. In Fig. 6.2,
q^{(1)}, q^{(2)}, and q^{(3)} are the vectors associated with the 1, 2, and 3 faces, respectively.
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Figure 6.2: Tensor visualization. 

6.2.2 Matrix representation 

Tensors can be represented as matrices (but not all matrices are tensors!):

T_{ij} = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \begin{matrix} \text{vector associated with 1 direction,} \\ \text{vector associated with 2 direction,} \\ \text{vector associated with 3 direction.} \end{matrix} (6.81)

A simple way to choose a vector q_j associated with a plane of arbitrary orientation is to
form the inner product of the tensor T_{ij} and the unit normal associated with the plane n_i:

q_j = n_i T_{ij}, q^T = n^T \cdot T. (6.82)

Here n_i has components which are the direction cosines of the chosen direction. For example
to determine the vector associated with face 2, we choose

n_i = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. (6.83)

Thus, in Gibbs notation we have

n^T \cdot T = (0, 1, 0) \cdot \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} = (T_{21}, T_{22}, T_{23}). (6.84)

In Einstein notation, we arrive at the same conclusion via

n_i T_{ij} = n_1 T_{1j} + n_2 T_{2j} + n_3 T_{3j}, (6.85)
= (0) T_{1j} + (1) T_{2j} + (0) T_{3j}, (6.86)
= (T_{21}, T_{22}, T_{23}). (6.87)

6.2.3 Transpose of a tensor, symmetric and anti-symmetric tensors

The transpose T^T_{ij} of a tensor T_{ij} is found by trading elements across the diagonal

T^T_{ij} \equiv T_{ji}, (6.88)

so

T^T_{ij} = \begin{pmatrix} T_{11} & T_{21} & T_{31} \\ T_{12} & T_{22} & T_{32} \\ T_{13} & T_{23} & T_{33} \end{pmatrix}. (6.89)

A tensor is symmetric if it is equal to its transpose, i.e.

T_{ij} = T_{ji}, T = T^T, if symmetric. (6.90)

A tensor is anti-symmetric if it is equal to the additive inverse of its transpose, i.e.

T_{ij} = -T_{ji}, T = -T^T, if anti-symmetric. (6.91)

A tensor is asymmetric if it is neither symmetric nor anti-symmetric.

The tensor inner product of a symmetric tensor S_{ij} and anti-symmetric tensor A_{ij} can
be shown to be 0:

S_{ij} A_{ij} = 0, S : A = 0. (6.92)

Here the ":" notation indicates a tensor inner product.


Example 6.6

Show S_{ij} A_{ij} = 0 for a two-dimensional space.

Take a general symmetric tensor to be

S_{ij} = \begin{pmatrix} a & b \\ b & c \end{pmatrix}. (6.93)

Take a general anti-symmetric tensor to be

A_{ij} = \begin{pmatrix} 0 & d \\ -d & 0 \end{pmatrix}. (6.94)

So

S_{ij} A_{ij} = S_{11} A_{11} + S_{12} A_{12} + S_{21} A_{21} + S_{22} A_{22}, (6.95)
= a(0) + bd - bd + c(0), (6.96)
= 0. (6.97)
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J 



An arbitrary tensor can be represented as the sum of a symmetric and anti-symmetric
tensor:

T_{ij} = \frac{1}{2} T_{ij} + \frac{1}{2} T_{ij} + \frac{1}{2} T_{ji} - \frac{1}{2} T_{ji}, (6.98)
= \frac{1}{2} (T_{ij} + T_{ji}) + \frac{1}{2} (T_{ij} - T_{ji}). (6.99)

So with

T_{(ij)} \equiv \frac{1}{2} (T_{ij} + T_{ji}), (6.100)
T_{[ij]} \equiv \frac{1}{2} (T_{ij} - T_{ji}), (6.101)

we arrive at

T_{ij} = \underbrace{T_{(ij)}}_{\text{symmetric}} + \underbrace{T_{[ij]}}_{\text{anti-symmetric}}. (6.102)

The first term, T_{(ij)}, is called the symmetric part of T_{ij}; the second term, T_{[ij]}, is called the
anti-symmetric part of T_{ij}.

6.2.4 Dual vector of an anti- symmetric tensor 

As the anti-symmetric part of a three by three tensor has only three independent components,
we might expect a three-component vector can be associated with this. Let us define the
dual vector to be

d_i \equiv \frac{1}{2} \epsilon_{ijk} T_{jk} = \underbrace{\frac{1}{2} \epsilon_{ijk} T_{(jk)}}_{=0} + \frac{1}{2} \epsilon_{ijk} T_{[jk]}. (6.103)

For fixed i, \epsilon_{ijk} is anti-symmetric. So the first term is zero, being for fixed i the tensor inner
product of an anti-symmetric and symmetric tensor. Thus,

d_i = \frac{1}{2} \epsilon_{ijk} T_{[jk]}. (6.104)

Let us find the inverse. Apply \epsilon_{ilm} to both sides of Eq. (6.103) to get

\epsilon_{ilm} d_i = \frac{1}{2} \epsilon_{ilm} \epsilon_{ijk} T_{jk}, (6.105)
= \frac{1}{2} (\delta_{lj} \delta_{mk} - \delta_{lk} \delta_{mj}) T_{jk}, (6.106)
= \frac{1}{2} (T_{lm} - T_{ml}), (6.107)
= T_{[lm]}, (6.108)
T_{[lm]} = \epsilon_{ilm} d_i, (6.109)
T_{[ij]} = \epsilon_{kij} d_k, (6.110)
T_{[ij]} = \epsilon_{ijk} d_k. (6.111)

Expanding, we can see that

T_{[ij]} = \epsilon_{ijk} d_k = \epsilon_{ij1} d_1 + \epsilon_{ij2} d_2 + \epsilon_{ij3} d_3 = \begin{pmatrix} 0 & d_3 & -d_2 \\ -d_3 & 0 & d_1 \\ d_2 & -d_1 & 0 \end{pmatrix}. (6.112)

The matrix form realized is obvious when one considers that an individual term, such as
\epsilon_{ij1} d_1, only has a value when i,j = 2,3 or i,j = 3,2, and takes on values of \pm d_1 in those
cases. In summary, the general dimension three tensor can be written as

T_{ij} = T_{(ij)} + \epsilon_{ijk} d_k. (6.113)



6.2.5 Principal axes and tensor invariants 

Given a tensor T_{ij}, find the associated direction such that the vector components in this
associated direction are parallel to the direction. So we want

n_i T_{ij} = \lambda n_j. (6.114)

This defines an eigenvalue problem; this will be discussed further in Sec. 7.4.4. Linear algebra
gives us the eigenvalues and associated eigenvectors.

n_i T_{ij} = \lambda n_i \delta_{ij}, (6.115)
n_i (T_{ij} - \lambda \delta_{ij}) = 0, (6.116)
(n_1, n_2, n_3) \cdot \begin{pmatrix} T_{11} - \lambda & T_{12} & T_{13} \\ T_{21} & T_{22} - \lambda & T_{23} \\ T_{31} & T_{32} & T_{33} - \lambda \end{pmatrix} = (0, 0, 0). (6.117)

This is equivalent to n^T \cdot (T - \lambda I) = 0^T or (T^T - \lambda I) \cdot n = 0. We get non-trivial solutions if

\begin{vmatrix} T_{11} - \lambda & T_{12} & T_{13} \\ T_{21} & T_{22} - \lambda & T_{23} \\ T_{31} & T_{32} & T_{33} - \lambda \end{vmatrix} = 0. (6.118)

We are actually finding the so-called left eigenvectors of T_{ij}. These arise with less frequency
than the right eigenvectors, which are defined by T_{ij} u_j = \lambda \delta_{ij} u_j. Right and left eigenvalue
problems are discussed later in Sec. 7.4.4.

We know from linear algebra that such an equation for a third order matrix gives rise to
a characteristic polynomial for \lambda of the form

\lambda^3 - I_T^{(1)} \lambda^2 + I_T^{(2)} \lambda - I_T^{(3)} = 0, (6.119)

where I_T^{(1)}, I_T^{(2)}, I_T^{(3)} are scalars which are functions of all the scalars T_{ij}. The I_T's are known
as the invariants of the tensor T_{ij}. The invariants will not change if the coordinate axes are
rotated; in contrast, the scalar components T_{ij} will change under rotation. The invariants
can be shown to be given by

I_T^{(1)} = T_{ii} = T_{11} + T_{22} + T_{33} = \mathrm{tr}\, T, (6.120)
I_T^{(2)} = \frac{1}{2} (T_{ii} T_{jj} - T_{ij} T_{ji}) = \frac{1}{2} \left( (\mathrm{tr}\, T)^2 - \mathrm{tr}(T \cdot T) \right) = (\det T)(\mathrm{tr}\, T^{-1}), (6.121)
= \frac{1}{2} \left( T_{(ii)} T_{(jj)} + T_{[ij]} T_{[ij]} - T_{(ij)} T_{(ij)} \right), (6.122)
I_T^{(3)} = \epsilon_{ijk} T_{1i} T_{2j} T_{3k} = \det T. (6.123)

Here, "tr" denotes the trace. It can also be shown that if \lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)} are the three
eigenvalues, then the invariants can also be expressed as

I_T^{(1)} = \lambda^{(1)} + \lambda^{(2)} + \lambda^{(3)}, (6.124)
I_T^{(2)} = \lambda^{(1)} \lambda^{(2)} + \lambda^{(2)} \lambda^{(3)} + \lambda^{(3)} \lambda^{(1)}, (6.125)
I_T^{(3)} = \lambda^{(1)} \lambda^{(2)} \lambda^{(3)}. (6.126)

If T_{ij} is real and symmetric, it can be shown that

• the eigenvalues are real,

• eigenvectors corresponding to distinct eigenvalues are real and orthogonal, and

• the left and right eigenvectors are identical.

A sketch of a volume element rotated to be aligned with a set of orthogonal principal axes
is shown in Figure 6.3.

If the matrix is asymmetric, the eigenvalues could be complex, and the eigenvectors need
not be orthogonal. It is often most physically relevant to decompose a tensor into symmetric and
anti-symmetric parts and find the orthogonal basis vectors and real eigenvalues associated
with the symmetric part and the dual vector associated with the anti-symmetric part.

In continuum mechanics,

• the symmetric part of a tensor can be associated with deformation along principal
axes, and

• the anti-symmetric part of a tensor can be associated with rotation of an element.

Figure 6.3: Sketch depicting rotation of volume element to be aligned with principal axes.
Tensor T_{ij} must be symmetric to guarantee existence of orthogonal principal directions.



I 

Example 6. 7 

Decompose the tensor given here into a combination of orthogonal basis vectors and a dual vector.

T_{ij} = \begin{pmatrix} 1 & 1 & -2 \\ 3 & 2 & -3 \\ -4 & 1 & 1 \end{pmatrix}. (6.127)

First

T_{(ij)} = \frac{1}{2} (T_{ij} + T_{ji}) = \begin{pmatrix} 1 & 2 & -3 \\ 2 & 2 & -1 \\ -3 & -1 & 1 \end{pmatrix}, (6.128)
T_{[ij]} = \frac{1}{2} (T_{ij} - T_{ji}) = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -2 \\ -1 & 2 & 0 \end{pmatrix}. (6.129)

First, get the dual vector d_i:

d_i = \frac{1}{2} \epsilon_{ijk} T_{[jk]}, (6.130)
d_1 = \frac{1}{2} \epsilon_{1jk} T_{[jk]} = \frac{1}{2} (\epsilon_{123} T_{[23]} + \epsilon_{132} T_{[32]}) = \frac{1}{2} ((1)(-2) + (-1)(2)) = -2, (6.131)
d_2 = \frac{1}{2} \epsilon_{2jk} T_{[jk]} = \frac{1}{2} (\epsilon_{213} T_{[13]} + \epsilon_{231} T_{[31]}) = \frac{1}{2} ((-1)(1) + (1)(-1)) = -1, (6.132)
d_3 = \frac{1}{2} \epsilon_{3jk} T_{[jk]} = \frac{1}{2} (\epsilon_{312} T_{[12]} + \epsilon_{321} T_{[21]}) = \frac{1}{2} ((1)(-1) + (-1)(1)) = -1, (6.133)
d_i = (-2, -1, -1)^T. (6.134)

Note that Eq. (6.112) is satisfied.

Now find the eigenvalues and eigenvectors for the symmetric part.

\begin{vmatrix} 1 - \lambda & 2 & -3 \\ 2 & 2 - \lambda & -1 \\ -3 & -1 & 1 - \lambda \end{vmatrix} = 0. (6.135)

We get the characteristic polynomial,

\lambda^3 - 4 \lambda^2 - 9 \lambda + 9 = 0. (6.136)

The eigenvalue and associated normalized eigenvector for each root is

\lambda^{(1)} = 5.36488, n_i^{(1)} = (-0.630537, -0.540358, 0.557168)^T, (6.137)
\lambda^{(2)} = -2.14644, n_i^{(2)} = (-0.740094, 0.202303, -0.641353)^T, (6.138)
\lambda^{(3)} = 0.781562, n_i^{(3)} = (-0.233844, 0.816754, 0.527476)^T. (6.139)

It is easily verified that the eigenvectors are mutually orthogonal. When the coordinates are transformed to be
aligned with the principal axes, the magnitude of the vector associated with each face is the eigenvalue;
this vector points in the same direction as the unit normal associated with the face.
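
The steps of Example 6.7 can be retraced numerically. The sketch below assumes zero-based indices (so the alternating symbol is evaluated on 0, 1, 2), takes the eigenvalues and first eigenvector quoted in Eqs. (6.137)-(6.139) as given rather than recomputing them, and uses illustrative helper names.

```python
def epsilon(i, j, k):
    """Alternating symbol; valid for any three consecutive integer labels."""
    return (i - j) * (j - k) * (k - i) // 2

T = [[1.0, 1.0, -2.0],
     [3.0, 2.0, -3.0],
     [-4.0, 1.0, 1.0]]
R = range(3)

# Symmetric and anti-symmetric parts, Eqs. (6.100)-(6.101).
sym = [[0.5 * (T[i][j] + T[j][i]) for j in R] for i in R]
anti = [[0.5 * (T[i][j] - T[j][i]) for j in R] for i in R]

# Dual vector d_i = (1/2) epsilon_ijk T_[jk], Eq. (6.104).
d = [0.5 * sum(epsilon(i, j, k) * anti[j][k] for j in R for k in R) for i in R]

# Rebuild the anti-symmetric part from the dual vector, Eq. (6.111).
rebuilt = [[sum(epsilon(i, j, k) * d[k] for k in R) for j in R] for i in R]

def char_poly(lam):
    """Characteristic polynomial of the symmetric part, Eq. (6.136)."""
    return lam ** 3 - 4.0 * lam ** 2 - 9.0 * lam + 9.0

# Quoted eigenvalues and first eigenvector, used here only for checking.
eigenvalues = [5.36488, -2.14644, 0.781562]
n1 = [-0.630537, -0.540358, 0.557168]
residuals = [char_poly(lam) for lam in eigenvalues]
eig_residual = [sum(sym[i][j] * n1[j] for j in R) - eigenvalues[0] * n1[i]
                for i in R]

print(d, residuals, eig_residual)
```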

I 



I 

Example 6.8 

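For a given tensor, which we will take to be symmetric, though the theory applies to non-symmetric
tensors as well,

T_{ij} = T = \begin{pmatrix} 1 & 2 & 4 \\ 2 & 3 & -1 \\ 4 & -1 & 1 \end{pmatrix}, (6.140)

find the three basic tensor invariants, I_T^{(1)}, I_T^{(2)}, and I_T^{(3)}, and show they are truly invariant when the
tensor is subjected to a rotation with direction cosine matrix of

\ell_{ij} = Q = \begin{pmatrix} \frac{1}{\sqrt{6}} & \sqrt{\frac{2}{3}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{pmatrix}. (6.141)

Calculation shows that \det Q = 1, and Q \cdot Q^T = I, so the matrix Q is volume- and orientation-preserving,
and thus a rotation matrix. As an aside, the construction of an orthogonal matrix, such as
our Q, is non-trivial. One method of construction involves determining a set of orthogonal vectors via
a process to be described later, see Sec. 7.3.2.5.

The eigenvalues of T, which are the principal values, are easily calculated to be

\lambda^{(1)} = 5.28675, \lambda^{(2)} = -3.67956, \lambda^{(3)} = 3.39281. (6.142)

The three invariants of T_{ij} are

I_T^{(1)} = \mathrm{tr}(T) = \mathrm{tr} \begin{pmatrix} 1 & 2 & 4 \\ 2 & 3 & -1 \\ 4 & -1 & 1 \end{pmatrix} = 1 + 3 + 1 = 5, (6.143)
I_T^{(2)} = \frac{1}{2} \left( (\mathrm{tr}(T))^2 - \mathrm{tr}(T \cdot T) \right) = \frac{1}{2} (5^2 - 53) = -14, (6.144)
I_T^{(3)} = \det T = \det \begin{pmatrix} 1 & 2 & 4 \\ 2 & 3 & -1 \\ 4 & -1 & 1 \end{pmatrix} = -66. (6.145)

Now when we rotate the tensor T, we get a transformed tensor given by

\overline{T} = Q^T \cdot T \cdot Q = \begin{pmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} \\ \sqrt{\frac{2}{3}} & -\frac{1}{\sqrt{3}} & 0 \\ \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1 & 2 & 4 \\ 2 & 3 & -1 \\ 4 & -1 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{6}} & \sqrt{\frac{2}{3}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{pmatrix}, (6.146)
= \begin{pmatrix} 4.10238 & 2.52239 & 1.60948 \\ 2.52239 & -0.218951 & -2.91291 \\ 1.60948 & -2.91291 & 1.11657 \end{pmatrix}. (6.147)

We then seek the tensor invariants of \overline{T}. Leaving out some of the details, which are the same as those
for calculating the invariants of T, we find the invariants indeed are invariant:

I_{\overline{T}}^{(1)} = 4.10238 - 0.218951 + 1.11657 = 5, (6.148)
I_{\overline{T}}^{(2)} = \frac{1}{2} (5^2 - 53) = -14, (6.149)
I_{\overline{T}}^{(3)} = -66. (6.150)

Finally, we verify that the tensor invariants are indeed related to the principal values (the eigenvalues
of the tensor) as follows

I_T^{(1)} = \lambda^{(1)} + \lambda^{(2)} + \lambda^{(3)} = 5.28675 - 3.67956 + 3.39281 = 5, (6.151)
I_T^{(2)} = \lambda^{(1)} \lambda^{(2)} + \lambda^{(2)} \lambda^{(3)} + \lambda^{(3)} \lambda^{(1)}
= (5.28675)(-3.67956) + (-3.67956)(3.39281) + (3.39281)(5.28675) = -14, (6.152)
I_T^{(3)} = \lambda^{(1)} \lambda^{(2)} \lambda^{(3)} = (5.28675)(-3.67956)(3.39281) = -66. (6.153)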
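
The invariance demonstrated in Example 6.8 is easy to confirm numerically. The following sketch assumes plain nested lists for the matrices and evaluates the invariants from Eqs. (6.120), (6.121), and (6.123) before and after the rotation of Eq. (6.80); the helper names are illustrative.

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det3(a):
    """Determinant of a 3 x 3 matrix by cofactor expansion on the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def invariants(t):
    """(I1, I2, I3) from the trace and determinant formulas."""
    i1 = trace(t)
    i2 = 0.5 * (i1 ** 2 - trace(matmul(t, t)))
    return i1, i2, det3(t)

T = [[1.0, 2.0, 4.0], [2.0, 3.0, -1.0], [4.0, -1.0, 1.0]]
s2, s3, s6 = math.sqrt(2.0), math.sqrt(3.0), math.sqrt(6.0)
Q = [[1.0 / s6, math.sqrt(2.0 / 3.0), 1.0 / s6],
     [1.0 / s3, -1.0 / s3, 1.0 / s3],
     [1.0 / s2, 0.0, -1.0 / s2]]

# Rotated tensor, Tbar = Q^T . T . Q.
Tbar = matmul(matmul(transpose(Q), T), Q)

inv_T = invariants(T)
inv_Tbar = invariants(Tbar)
print(inv_T, inv_Tbar)
```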



6.3 Algebra of vectors 



Here we will primarily use bold letters for vectors, such as in $\mathbf{u}$. At times we will use the
notation $u_i$ to represent a vector.

6.3.1 Definition and properties

Null vector: A vector with zero components.

Multiplication by a scalar $\alpha$: $\alpha\mathbf{u} = \alpha u_1 \mathbf{e}_1 + \alpha u_2 \mathbf{e}_2 + \alpha u_3 \mathbf{e}_3 = \alpha u_i$,

Sum of vectors: $\mathbf{u} + \mathbf{v} = (u_1 + v_1)\mathbf{e}_1 + (u_2 + v_2)\mathbf{e}_2 + (u_3 + v_3)\mathbf{e}_3 = (u_i + v_i)$,

Magnitude, length, or norm of a vector: $||\mathbf{u}||_2 = \sqrt{u_1^2 + u_2^2 + u_3^2} = \sqrt{u_i u_i}$,

Triangle inequality: $||\mathbf{u} + \mathbf{v}||_2 \le ||\mathbf{u}||_2 + ||\mathbf{v}||_2$.

Here the subscript 2 in || • || 2 indicates we are considering a Euclidean norm. In many 
sources in the literature this subscript is omitted, and the norm is understood to be the 
Euclidean norm. In a more general sense, we can still retain the property of a norm for a 
more general p-norm for a three-dimensional vector: 

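\[
||\mathbf{u}||_p = \left( |u_1|^p + |u_2|^p + |u_3|^p \right)^{1/p}, \qquad 1 \le p < \infty. \tag{6.154}
\]

For example the 1-norm of a vector is the sum of the absolute values of its components:

\[
||\mathbf{u}||_1 = \left( |u_1| + |u_2| + |u_3| \right). \tag{6.155}
\]

The $\infty$-norm selects the largest component:

\[
||\mathbf{u}||_\infty = \lim_{p \to \infty} \left( |u_1|^p + |u_2|^p + |u_3|^p \right)^{1/p} = \max_{i=1,2,3} |u_i|. \tag{6.156}
\]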
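As a minimal numerical sketch of the p-norms just defined (the function names `p_norm` and `inf_norm` and the sample vector are illustrative assumptions, not part of the notes), one can check that a large but finite p already approaches the largest component:

```python
def p_norm(u, p):
    """Return (sum_i |u_i|^p)^(1/p) for a sequence u and p >= 1."""
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    return sum(abs(ui) ** p for ui in u) ** (1.0 / p)


def inf_norm(u):
    """Return the infinity-norm, i.e. the largest absolute component."""
    return max(abs(ui) for ui in u)


u = [3.0, -4.0, 12.0]          # sample vector chosen only for illustration
norm_1 = p_norm(u, 1)          # sum of absolute values
norm_2 = p_norm(u, 2)          # Euclidean norm
norm_50 = p_norm(u, 50)        # large p: close to the infinity-norm
norm_inf = inf_norm(u)
print(norm_1, norm_2, norm_50, norm_inf)
```

The limit in the definition of the infinity-norm is replaced here by a direct maximum, which is what the limit evaluates to.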

6.3.2 Scalar product (dot product, inner product) 

The scalar product of u and v is defined for vectors with real components as 

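\[
\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^T \cdot \mathbf{v} = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = u_1 v_1 + u_2 v_2 + u_3 v_3 = u_i v_i. \tag{6.157}
\]

Note that the term $u_i v_i$ is a scalar, which explains the nomenclature "scalar product."
The vectors $\mathbf{u}$ and $\mathbf{v}$ are said to be orthogonal if $\mathbf{u}^T \cdot \mathbf{v} = 0$. Also

\[
\langle \mathbf{u}, \mathbf{u} \rangle = \mathbf{u}^T \cdot \mathbf{u} = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} \cdot \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = u_1^2 + u_2^2 + u_3^2 = u_i u_i = \left( ||\mathbf{u}||_2 \right)^2. \tag{6.158}
\]

We will consider important modifications for vectors with complex components later in
Sec. 7.3.2. In the same section, we will consider the generalized notion of an inner product,
denoted here by $\langle \cdot , \cdot \rangle$.

6.3.3 Cross product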

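The cross product of $\mathbf{u}$ and $\mathbf{v}$ is defined as

\[
\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \epsilon_{ijk} u_j v_k. \tag{6.159}
\]

Note the cross product of two vectors is a vector.

Property: $\mathbf{u} \times \alpha\mathbf{u} = \mathbf{0}$. Let's use Cartesian index notation to prove this

\begin{align}
\mathbf{u} \times \alpha\mathbf{u} &= \epsilon_{ijk} u_j \alpha u_k, \tag{6.160} \\
&= \alpha \epsilon_{ijk} u_j u_k, \tag{6.161} \\
&= \alpha ( \epsilon_{i11} u_1 u_1 + \epsilon_{i12} u_1 u_2 + \epsilon_{i13} u_1 u_3 \tag{6.162} \\
&\quad + \epsilon_{i21} u_2 u_1 + \epsilon_{i22} u_2 u_2 + \epsilon_{i23} u_2 u_3 \tag{6.163} \\
&\quad + \epsilon_{i31} u_3 u_1 + \epsilon_{i32} u_3 u_2 + \epsilon_{i33} u_3 u_3 ), \tag{6.164} \\
&= 0, \qquad \text{for } i = 1, 2, 3, \tag{6.165}
\end{align}

since $\epsilon_{i11} = \epsilon_{i22} = \epsilon_{i33} = 0$ and $\epsilon_{i12} = -\epsilon_{i21}$, $\epsilon_{i13} = -\epsilon_{i31}$, and $\epsilon_{i23} = -\epsilon_{i32}$.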

6.3.4 Scalar triple product 

The scalar triple product of three vectors u, v, and w is defined by 

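\begin{align}
[\mathbf{u}, \mathbf{v}, \mathbf{w}] &= \mathbf{u}^T \cdot (\mathbf{v} \times \mathbf{w}), \tag{6.166} \\
&= \epsilon_{ijk} u_i v_j w_k. \tag{6.167}
\end{align}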

The scalar triple product is a scalar. Geometrically, it represents the volume of the paral- 
lelepiped with edges parallel to the three vectors. 

6.3.5 Identities 

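\begin{align}
[\mathbf{u}, \mathbf{v}, \mathbf{w}] &= -[\mathbf{u}, \mathbf{w}, \mathbf{v}], \tag{6.168} \\
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= (\mathbf{u}^T \cdot \mathbf{w})\mathbf{v} - (\mathbf{u}^T \cdot \mathbf{v})\mathbf{w}, \tag{6.169} \\
(\mathbf{u} \times \mathbf{v}) \times (\mathbf{w} \times \mathbf{x}) &= [\mathbf{u}, \mathbf{w}, \mathbf{x}]\mathbf{v} - [\mathbf{v}, \mathbf{w}, \mathbf{x}]\mathbf{u}, \tag{6.170} \\
(\mathbf{u} \times \mathbf{v})^T \cdot (\mathbf{w} \times \mathbf{x}) &= (\mathbf{u}^T \cdot \mathbf{w})(\mathbf{v}^T \cdot \mathbf{x}) - (\mathbf{u}^T \cdot \mathbf{x})(\mathbf{v}^T \cdot \mathbf{w}). \tag{6.171}
\end{align}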
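The four identities above can be spot-checked numerically. The following is a sketch under stated assumptions: the helper names (`dot`, `cross`, `triple`) and the four sample vectors are made up for illustration, and agreement at one set of vectors is a sanity check, not a proof.

```python
def dot(a, b):
    """Scalar product a_i b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Cross product, component form of epsilon_ijk a_j b_k."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def triple(a, b, c):
    """Scalar triple product [a, b, c] = a . (b x c)."""
    return dot(a, cross(b, c))


# Arbitrary sample vectors (illustrative only).
u = [1.0, 2.0, 3.0]
v = [-2.0, 0.5, 4.0]
w = [3.0, -1.0, 2.0]
x = [0.5, 1.5, -2.0]

# [u, v, w] = -[u, w, v]
lhs_a, rhs_a = triple(u, v, w), -triple(u, w, v)

# u x (v x w) = (u.w) v - (u.v) w
lhs_b = cross(u, cross(v, w))
rhs_b = [dot(u, w) * vi - dot(u, v) * wi for vi, wi in zip(v, w)]

# (u x v) x (w x x) = [u, w, x] v - [v, w, x] u
lhs_c = cross(cross(u, v), cross(w, x))
rhs_c = [triple(u, w, x) * vi - triple(v, w, x) * ui for ui, vi in zip(u, v)]

# (u x v).(w x x) = (u.w)(v.x) - (u.x)(v.w)
lhs_d = dot(cross(u, v), cross(w, x))
rhs_d = dot(u, w) * dot(v, x) - dot(u, x) * dot(v, w)

# The property u x (alpha u) = 0, with alpha = 2.5 chosen arbitrarily.
parallel_cross = cross(u, [2.5 * ui for ui in u])
```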



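Example 6.9

Prove Eq. (6.169) using Cartesian index notation.

\begin{align}
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= \epsilon_{ijk} u_j \left( \epsilon_{klm} v_l w_m \right), \tag{6.172} \\
&= \epsilon_{ijk} \epsilon_{klm} u_j v_l w_m, \tag{6.173} \\
&= \epsilon_{kij} \epsilon_{klm} u_j v_l w_m, \tag{6.174} \\
&= \left( \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} \right) u_j v_l w_m, \tag{6.175} \\
&= u_j v_i w_j - u_j v_j w_i, \tag{6.176} \\
&= u_j w_j v_i - u_j v_j w_i, \tag{6.177} \\
&= (\mathbf{u}^T \cdot \mathbf{w})\mathbf{v} - (\mathbf{u}^T \cdot \mathbf{v})\mathbf{w}. \tag{6.178}
\end{align}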



6.4 Calculus of vectors 

6.4.1 Vector function of single scalar variable 

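If we have the scalar function $\phi(\tau)$ and vector functions $\mathbf{u}(\tau)$ and $\mathbf{v}(\tau)$, some useful identities,
based on the product rule, which can be proved include

\begin{align}
\frac{d}{d\tau}(\phi \mathbf{u}) &= \phi \frac{d\mathbf{u}}{d\tau} + \frac{d\phi}{d\tau}\mathbf{u}, & \frac{d}{d\tau}(\phi u_i) &= \phi \frac{du_i}{d\tau} + \frac{d\phi}{d\tau} u_i, \tag{6.179} \\
\frac{d}{d\tau}(\mathbf{u}^T \cdot \mathbf{v}) &= \mathbf{u}^T \cdot \frac{d\mathbf{v}}{d\tau} + \frac{d\mathbf{u}^T}{d\tau} \cdot \mathbf{v}, & \frac{d}{d\tau}(u_i v_i) &= u_i \frac{dv_i}{d\tau} + \frac{du_i}{d\tau} v_i, \tag{6.180} \\
\frac{d}{d\tau}(\mathbf{u} \times \mathbf{v}) &= \mathbf{u} \times \frac{d\mathbf{v}}{d\tau} + \frac{d\mathbf{u}}{d\tau} \times \mathbf{v}, & \frac{d}{d\tau}(\epsilon_{ijk} u_j v_k) &= \epsilon_{ijk} u_j \frac{dv_k}{d\tau} + \epsilon_{ijk} v_k \frac{du_j}{d\tau}. \tag{6.181}
\end{align}

Here $\tau$ is a general scalar parameter, which may or may not have a simple physical interpretation.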

6.4.2 Differential geometry of curves 

Now let us consider a general discussion of curves in space. If

\[
\mathbf{r}(\tau) = x_i(\tau)\mathbf{e}_i = x_i(\tau), \tag{6.182}
\]

then $\mathbf{r}(\tau)$ describes a curve in three-dimensional space. If we require that the basis vectors
be constants (this will not be the case in most general coordinate systems, but is for ordinary
Cartesian systems), the derivative of Eq. (6.182) is

\[
\frac{d\mathbf{r}(\tau)}{d\tau} = \mathbf{r}'(\tau) = x_i'(\tau)\mathbf{e}_i = x_i'(\tau). \tag{6.183}
\]

Now $\mathbf{r}'(\tau)$ is a vector that is tangent to the curve. A unit vector in this direction is

\[
\mathbf{t} = \frac{\mathbf{r}'(\tau)}{||\mathbf{r}'(\tau)||_2}, \tag{6.184}
\]

where

\[
||\mathbf{r}'(\tau)||_2 = \sqrt{x_i' x_i'}. \tag{6.185}
\]

In the special case in which $\tau$ is time $t$, we denote the derivative by a dot ( $\dot{\ }$ ) notation
rather than a prime ( $'$ ) notation; $\dot{\mathbf{r}}$ is the velocity vector, $\dot{x}_i$ its components, and $||\dot{\mathbf{r}}||_2$ the
magnitude. Note that the unit tangent vector $\mathbf{t}$ is not the scalar parameter for time, $t$. Also
we will occasionally use the scalar components of $\mathbf{t}$: $t_i$, which again are not related to time
$t$.

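Take $s(t)$ to be the distance along the curve. Pythagoras' theorem tells us for differential
distances that

\begin{align}
ds^2 &= dx_1^2 + dx_2^2 + dx_3^2, \tag{6.186} \\
ds &= \sqrt{dx_1^2 + dx_2^2 + dx_3^2}, \tag{6.187} \\
ds &= ||dx_i||_2, \tag{6.188} \\
\frac{ds}{dt} &= \left|\left| \frac{dx_i}{dt} \right|\right|_2, \tag{6.189} \\
&= ||\dot{\mathbf{r}}(t)||_2, \tag{6.190}
\end{align}

so that

\[
\mathbf{t} = \frac{\dot{\mathbf{r}}}{||\dot{\mathbf{r}}||_2} = \frac{d\mathbf{r}/dt}{ds/dt} = \frac{d\mathbf{r}}{ds}, \qquad t_i = \frac{dr_i}{ds}. \tag{6.191}
\]

Also integrating Eq. (6.190) with respect to $t$ gives

\[
s = \int_a^b ||\dot{\mathbf{r}}(t)||_2 \, dt = \int_a^b \sqrt{\frac{dx_i}{dt}\frac{dx_i}{dt}} \, dt = \int_a^b \sqrt{\frac{dx_1}{dt}\frac{dx_1}{dt} + \frac{dx_2}{dt}\frac{dx_2}{dt} + \frac{dx_3}{dt}\frac{dx_3}{dt}} \, dt, \tag{6.192}
\]

to be the distance along the curve between $t = a$ and $t = b$.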



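Example 6.10

If

\[
\mathbf{r}(t) = 2t^2 \, \mathbf{i} + t^3 \, \mathbf{j}, \tag{6.193}
\]

find the unit tangent at $t = 1$, and the length of the curve from $t = 0$ to $t = 1$.

The derivative is

\[
\dot{\mathbf{r}}(t) = 4t \, \mathbf{i} + 3t^2 \, \mathbf{j}. \tag{6.194}
\]

At $t = 1$,

\[
\dot{\mathbf{r}}(t = 1) = 4\mathbf{i} + 3\mathbf{j}, \tag{6.195}
\]

so that the unit vector in this direction is

\[
\mathbf{t} = \frac{4}{5}\mathbf{i} + \frac{3}{5}\mathbf{j}. \tag{6.196}
\]

The length of the curve from $t = 0$ to $t = 1$ is

\begin{align}
s &= \int_0^1 \sqrt{16t^2 + 9t^4} \, dt, \tag{6.197} \\
&= \left. \frac{1}{27}\left(16 + 9t^2\right)^{3/2} \right|_0^1, \tag{6.198} \\
&= \frac{61}{27}. \tag{6.199}
\end{align}

Figure 6.4: Sketch for determination of radius of curvature.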
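The arc length and unit tangent found in Example 6.10 can be cross-checked by direct quadrature of the speed. This is only a sketch: the composite Simpson routine `simpson` and the other names are assumptions introduced here, and any standard quadrature would serve equally well.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def speed(t):
    """||dr/dt||_2 for r(t) = 2 t^2 i + t^3 j, i.e. sqrt(16 t^2 + 9 t^4)."""
    return math.hypot(4.0 * t, 3.0 * t ** 2)


def unit_tangent(t):
    """Unit tangent (dr/dt) / ||dr/dt||_2, valid where the speed is nonzero."""
    s = speed(t)
    return (4.0 * t / s, 3.0 * t ** 2 / s)


arc_length = simpson(speed, 0.0, 1.0)   # compare with 61/27
tangent_at_1 = unit_tangent(1.0)        # compare with (4/5, 3/5)
print(arc_length, 61.0 / 27.0, tangent_at_1)
```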



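In Fig. 6.4, $\mathbf{r}(t)$ describes a circle. Two unit tangents, $\mathbf{t}$ and $\hat{\mathbf{t}}$, are drawn at times $t$ and
$t + \Delta t$. At time $t$ we have

\[
\mathbf{t} = -\sin\theta \, \mathbf{i} + \cos\theta \, \mathbf{j}. \tag{6.200}
\]

At time $t + \Delta t$ we have

\[
\hat{\mathbf{t}} = -\sin(\theta + \Delta\theta) \, \mathbf{i} + \cos(\theta + \Delta\theta) \, \mathbf{j}. \tag{6.201}
\]

Expanding Eq. (6.201) in a Taylor series about $\Delta\theta = 0$, we get

\[
\hat{\mathbf{t}} = \left( -\sin\theta - \Delta\theta\cos\theta + O(\Delta\theta)^2 \right)\mathbf{i} + \left( \cos\theta - \Delta\theta\sin\theta + O(\Delta\theta)^2 \right)\mathbf{j}, \tag{6.202}
\]

so as $\Delta\theta \to 0$,

\begin{align}
\hat{\mathbf{t}} - \mathbf{t} &= -\Delta\theta\cos\theta \, \mathbf{i} - \Delta\theta\sin\theta \, \mathbf{j}, \tag{6.203} \\
\Delta\mathbf{t} &= \Delta\theta \underbrace{\left( -\cos\theta \, \mathbf{i} - \sin\theta \, \mathbf{j} \right)}_{\text{unit vector}}. \tag{6.204}
\end{align}

It is easily verified that $\Delta\mathbf{t}^T \cdot \mathbf{t} = 0$, so $\Delta\mathbf{t}$ is normal to $\mathbf{t}$. Furthermore, since $-\cos\theta \, \mathbf{i} - \sin\theta \, \mathbf{j}$
is a unit vector,

\[
||\Delta\mathbf{t}||_2 = \Delta\theta. \tag{6.205}
\]

Now for $\Delta\theta \to 0$,

\[
\Delta s = \rho\Delta\theta, \tag{6.206}
\]

where $\rho$ is the radius of curvature. So

\[
||\Delta\mathbf{t}||_2 = \frac{\Delta s}{\rho}. \tag{6.207}
\]

Thus,

\[
\left|\left| \frac{\Delta\mathbf{t}}{\Delta s} \right|\right|_2 = \frac{1}{\rho}. \tag{6.208}
\]

Taking all limits to zero, we get

\[
\left|\left| \frac{d\mathbf{t}}{ds} \right|\right|_2 = \frac{1}{\rho}. \tag{6.209}
\]

The term on the right side of Eq. (6.209) is often defined as the curvature, $\kappa$:

\[
\kappa = \frac{1}{\rho}. \tag{6.210}
\]

Thus, the curvature $\kappa$ is the magnitude of $d\mathbf{t}/ds$; it gives a measure of how the unit tangent
changes as one moves along the curve.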



6.4.2.1 Curves on a plane 

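The plane curve $y = f(x)$ in the $x$-$y$ plane can be represented as

\[
\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j}, \tag{6.211}
\]

where $x(t) = t$ and $y(t) = f(t)$. Differentiating, we have

\[
\dot{\mathbf{r}}(t) = \dot{x}(t) \, \mathbf{i} + \dot{y}(t) \, \mathbf{j}. \tag{6.212}
\]

The unit vector from Eq. (6.184) is

\begin{align}
\mathbf{t} &= \frac{\dot{x} \, \mathbf{i} + \dot{y} \, \mathbf{j}}{\left( \dot{x}^2 + \dot{y}^2 \right)^{1/2}}, \tag{6.213} \\
&= \frac{\mathbf{i} + y' \, \mathbf{j}}{\left( 1 + (y')^2 \right)^{1/2}}, \tag{6.214}
\end{align}

where the primes are derivatives with respect to $x$. Since

\begin{align}
ds^2 &= dx^2 + dy^2, \tag{6.215} \\
ds &= \left( dx^2 + dy^2 \right)^{1/2}, \tag{6.216} \\
\frac{ds}{dx} &= \frac{1}{dx}\left( dx^2 + dy^2 \right)^{1/2}, \tag{6.217} \\
\frac{ds}{dx} &= \left( 1 + (y')^2 \right)^{1/2}, \tag{6.218}
\end{align}

we have, by first expanding $d\mathbf{t}/ds$ with the chain rule, then applying the quotient rule to
expand the derivative of Eq. (6.214) along with the use of Eq. (6.218),

\begin{align}
\frac{d\mathbf{t}}{ds} &= \frac{d\mathbf{t}/dx}{ds/dx}, \tag{6.219} \\
&= \underbrace{\frac{\left( 1 + (y')^2 \right)^{1/2} y'' \, \mathbf{j} - \left( \mathbf{i} + y' \, \mathbf{j} \right)\left( 1 + (y')^2 \right)^{-1/2} y' y''}{1 + (y')^2}}_{d\mathbf{t}/dx} \; \underbrace{\frac{1}{\left( 1 + (y')^2 \right)^{1/2}}}_{1/(ds/dx)}, \tag{6.220} \\
&= \underbrace{\frac{y''}{\left( 1 + (y')^2 \right)^{3/2}}}_{\kappa} \; \underbrace{\frac{-y' \, \mathbf{i} + \mathbf{j}}{\left( 1 + (y')^2 \right)^{1/2}}}_{\mathbf{n}}. \tag{6.221}
\end{align}

As the second factor of Eq. (6.221) is a unit vector, the leading scalar factor must be the
magnitude of $d\mathbf{t}/ds$. We define this unit vector to be $\mathbf{n}$, and note that it is orthogonal to
the unit tangent vector $\mathbf{t}$:

\begin{align}
\mathbf{n}^T \cdot \mathbf{t} &= \frac{-y' \, \mathbf{i} + \mathbf{j}}{\left( 1 + (y')^2 \right)^{1/2}} \cdot \frac{\mathbf{i} + y' \, \mathbf{j}}{\left( 1 + (y')^2 \right)^{1/2}}, \tag{6.222} \\
&= \frac{-y' + y'}{1 + (y')^2}, \tag{6.223} \\
&= 0. \tag{6.224}
\end{align}

Expanding our notion of curvature and radius of curvature, we define $d\mathbf{t}/ds$ such that

\begin{align}
\frac{d\mathbf{t}}{ds} &= \kappa\mathbf{n}, \tag{6.225} \\
\left|\left| \frac{d\mathbf{t}}{ds} \right|\right|_2 &= \kappa. \tag{6.226}
\end{align}

Thus,

\begin{align}
\kappa &= \frac{y''}{\left( 1 + (y')^2 \right)^{3/2}}, \tag{6.227} \\
\rho &= \frac{\left( 1 + (y')^2 \right)^{3/2}}{y''}, \tag{6.228}
\end{align}

for curves on a plane.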
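A brief numerical illustration of the plane-curve curvature formula follows, as a sketch only. The function name `plane_curvature`, the central-difference approximation of y' and y'', and the two test curves are assumptions made here for illustration. Note that the formula is signed: a curve that is concave down gives a negative value.

```python
import math


def plane_curvature(f, x, h=1e-4):
    """Estimate y'' / (1 + y'^2)^(3/2) for y = f(x) using central differences."""
    yp = (f(x + h) - f(x - h)) / (2.0 * h)
    ypp = (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2
    return ypp / (1.0 + yp ** 2) ** 1.5


# Parabola y = x^2 at its vertex: y' = 0, y'' = 2, so the curvature is 2.
kappa_parabola = plane_curvature(lambda x: x ** 2, 0.0)

# Upper half of a circle of radius 2: magnitude 1/2, negative sign (concave down).
kappa_circle = plane_curvature(lambda x: math.sqrt(4.0 - x ** 2), 0.5)
print(kappa_parabola, kappa_circle)
```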
6.4.2.2 Curves in three-dimensional space

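We next expand these notions to three-dimensional space. A set of local, right-handed,
orthogonal coordinates can be defined at a point on a curve $\mathbf{r}(t)$. The unit vectors at this
point are the tangent $\mathbf{t}$, the principal normal $\mathbf{n}$, and the binormal $\mathbf{b}$, where

\begin{align}
\mathbf{t} &= \frac{d\mathbf{r}}{ds}, \tag{6.229} \\
\mathbf{n} &= \frac{1}{\kappa}\frac{d\mathbf{t}}{ds}, \tag{6.230} \\
\mathbf{b} &= \mathbf{t} \times \mathbf{n}. \tag{6.231}
\end{align}

We will first show that $\mathbf{t}$, $\mathbf{n}$, and $\mathbf{b}$ form an orthogonal system of unit vectors. We have
already seen that $\mathbf{t}$ is a unit vector tangent to the curve. By the product rule for vector
differentiation, we have the identity

\[
\mathbf{t}^T \cdot \frac{d\mathbf{t}}{ds} = \frac{1}{2}\frac{d}{ds}\left( \mathbf{t}^T \cdot \mathbf{t} \right). \tag{6.232}
\]

Since $\mathbf{t}^T \cdot \mathbf{t} = ||\mathbf{t}||_2^2 = 1$, we recover

\[
\mathbf{t}^T \cdot \frac{d\mathbf{t}}{ds} = 0. \tag{6.233}
\]

Thus, $\mathbf{t}$ is orthogonal to $d\mathbf{t}/ds$. Since $\mathbf{n}$ is parallel to $d\mathbf{t}/ds$, it is orthogonal to $\mathbf{t}$ also. From
Eqs. (6.209) and (6.230), we see that $\mathbf{n}$ is a unit vector. Furthermore, $\mathbf{b}$ is a unit vector
orthogonal to both $\mathbf{t}$ and $\mathbf{n}$ because of its definition in terms of a cross product of those
vectors in Eq. (6.231).

Next, we will derive some basic relations involving the unit vectors and the characteristics
of the curve. Take $d/ds$ of Eq. (6.231):

\begin{align}
\frac{d\mathbf{b}}{ds} &= \frac{d}{ds}\left( \mathbf{t} \times \mathbf{n} \right), \tag{6.234} \\
&= \frac{d\mathbf{t}}{ds} \times \underbrace{\mathbf{n}}_{(1/\kappa)\, d\mathbf{t}/ds} + \mathbf{t} \times \frac{d\mathbf{n}}{ds}, \tag{6.235} \\
&= \frac{d\mathbf{t}}{ds} \times \frac{1}{\kappa}\frac{d\mathbf{t}}{ds} + \mathbf{t} \times \frac{d\mathbf{n}}{ds}, \tag{6.236} \\
&= \frac{1}{\kappa}\underbrace{\frac{d\mathbf{t}}{ds} \times \frac{d\mathbf{t}}{ds}}_{=\mathbf{0}} + \mathbf{t} \times \frac{d\mathbf{n}}{ds}, \tag{6.237} \\
&= \mathbf{t} \times \frac{d\mathbf{n}}{ds}. \tag{6.238}
\end{align}

So we see that $d\mathbf{b}/ds$ is orthogonal to $\mathbf{t}$. In addition, since $||\mathbf{b}||_2 = 1$,

\begin{align}
\mathbf{b}^T \cdot \frac{d\mathbf{b}}{ds} &= \frac{1}{2}\frac{d}{ds}\left( \mathbf{b}^T \cdot \mathbf{b} \right), \tag{6.239} \\
&= \frac{1}{2}\frac{d}{ds}\left( ||\mathbf{b}||_2^2 \right), \tag{6.240} \\
&= \frac{1}{2}\frac{d}{ds}\left( 1^2 \right), \tag{6.241} \\
&= 0. \tag{6.242}
\end{align}

So $d\mathbf{b}/ds$ is orthogonal to $\mathbf{b}$ also. Since $d\mathbf{b}/ds$ is orthogonal to both $\mathbf{t}$ and $\mathbf{b}$, it must be
aligned with the only remaining direction, $\mathbf{n}$. So, we can write

\[
\frac{d\mathbf{b}}{ds} = \tau\mathbf{n}, \tag{6.243}
\]

where $\tau$ is the magnitude of $d\mathbf{b}/ds$, which we call the torsion of the curve.

From Eq. (6.231) it is easily deduced that $\mathbf{n} = \mathbf{b} \times \mathbf{t}$. Differentiating this with respect to $s$, we get

\begin{align}
\frac{d\mathbf{n}}{ds} &= \frac{d\mathbf{b}}{ds} \times \mathbf{t} + \mathbf{b} \times \frac{d\mathbf{t}}{ds}, \tag{6.244} \\
&= \tau\mathbf{n} \times \mathbf{t} + \mathbf{b} \times \kappa\mathbf{n}, \tag{6.245} \\
&= -\tau\mathbf{b} - \kappa\mathbf{t}. \tag{6.246}
\end{align}

Summarizing

\begin{align}
\frac{d\mathbf{t}}{ds} &= \kappa\mathbf{n}, \tag{6.247} \\
\frac{d\mathbf{n}}{ds} &= -\kappa\mathbf{t} - \tau\mathbf{b}, \tag{6.248} \\
\frac{d\mathbf{b}}{ds} &= \tau\mathbf{n}. \tag{6.249}
\end{align}

These are the Frenet-Serret² relations. In matrix form, we can say that

\[
\frac{d}{ds}\begin{pmatrix} \mathbf{t} \\ \mathbf{n} \\ \mathbf{b} \end{pmatrix} = \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & -\tau \\ 0 & \tau & 0 \end{pmatrix}\begin{pmatrix} \mathbf{t} \\ \mathbf{n} \\ \mathbf{b} \end{pmatrix}. \tag{6.250}
\]

Note the coefficient matrix is anti-symmetric.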
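Example 6.11

Find the local coordinates, the curvature, and the torsion for the helix

\[
\mathbf{r}(t) = a\cos t \, \mathbf{i} + a\sin t \, \mathbf{j} + bt \, \mathbf{k}. \tag{6.251}
\]

Taking the derivative and finding its magnitude we get

\begin{align}
\frac{d\mathbf{r}(t)}{dt} &= -a\sin t \, \mathbf{i} + a\cos t \, \mathbf{j} + b \, \mathbf{k}, \tag{6.252} \\
\left|\left| \frac{d\mathbf{r}(t)}{dt} \right|\right|_2 &= \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2}, \tag{6.253} \\
&= \sqrt{a^2 + b^2}. \tag{6.254}
\end{align}

This gives us the unit tangent vector $\mathbf{t}$:

\[
\mathbf{t} = \frac{d\mathbf{r}/dt}{\left|\left| d\mathbf{r}/dt \right|\right|_2} = \frac{-a\sin t \, \mathbf{i} + a\cos t \, \mathbf{j} + b \, \mathbf{k}}{\sqrt{a^2 + b^2}}. \tag{6.255}
\]

We also have

\begin{align}
\frac{ds}{dt} &= \sqrt{\left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 + \left( \frac{dz}{dt} \right)^2}, \tag{6.256} \\
&= \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2}, \tag{6.257} \\
&= \sqrt{a^2 + b^2}. \tag{6.258}
\end{align}

Continuing, we have

\begin{align}
\frac{d\mathbf{t}}{ds} &= \frac{d\mathbf{t}/dt}{ds/dt}, \tag{6.259} \\
&= -a\frac{\cos t \, \mathbf{i} + \sin t \, \mathbf{j}}{\sqrt{a^2 + b^2}} \frac{1}{\sqrt{a^2 + b^2}}, \tag{6.260} \\
&= \underbrace{\frac{a}{a^2 + b^2}}_{\kappa} \underbrace{\left( -\cos t \, \mathbf{i} - \sin t \, \mathbf{j} \right)}_{\mathbf{n}}, \tag{6.261} \\
&= \kappa\mathbf{n}. \tag{6.262}
\end{align}

Thus, the unit principal normal is

\[
\mathbf{n} = -\left( \cos t \, \mathbf{i} + \sin t \, \mathbf{j} \right). \tag{6.263}
\]

The curvature is

\[
\kappa = \frac{a}{a^2 + b^2}. \tag{6.264}
\]

The radius of curvature is

\[
\rho = \frac{a^2 + b^2}{a}. \tag{6.265}
\]

We also find the unit binormal

\begin{align}
\mathbf{b} &= \mathbf{t} \times \mathbf{n}, \tag{6.266} \\
&= \frac{1}{\sqrt{a^2 + b^2}}\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -a\sin t & a\cos t & b \\ -\cos t & -\sin t & 0 \end{vmatrix}, \tag{6.267} \\
&= \frac{b\sin t \, \mathbf{i} - b\cos t \, \mathbf{j} + a \, \mathbf{k}}{\sqrt{a^2 + b^2}}. \tag{6.268}
\end{align}

The torsion is determined from

\begin{align}
\tau\mathbf{n} &= \frac{d\mathbf{b}}{ds} = \frac{d\mathbf{b}/dt}{ds/dt}, \tag{6.269} \\
&= b\frac{\cos t \, \mathbf{i} + \sin t \, \mathbf{j}}{a^2 + b^2}, \tag{6.270} \\
&= \frac{-b}{a^2 + b^2}\left( -\cos t \, \mathbf{i} - \sin t \, \mathbf{j} \right), \tag{6.271}
\end{align}

from which

\[
\tau = -\frac{b}{a^2 + b^2}. \tag{6.272}
\]

² Jean Frédéric Frenet, 1816-1900, French mathematician, and Joseph Alfred Serret, 1819-1885, French
mathematician.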

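Further identities which can be proved relate directly to the time parameterization of $\mathbf{r}$:

\begin{align}
\frac{d\mathbf{r}}{dt} \times \frac{d^2\mathbf{r}}{dt^2} &= \kappa v^3 \mathbf{b}, \tag{6.273} \\
\left( \frac{d\mathbf{r}}{dt} \times \frac{d^2\mathbf{r}}{dt^2} \right)^T \cdot \frac{d^3\mathbf{r}}{dt^3} &= -\kappa^2 v^6 \tau, \tag{6.274} \\
\frac{\sqrt{||\ddot{\mathbf{r}}||_2^2 \, ||\dot{\mathbf{r}}||_2^2 - \left( \dot{\mathbf{r}}^T \cdot \ddot{\mathbf{r}} \right)^2}}{||\dot{\mathbf{r}}||_2^3} &= \kappa, \tag{6.275}
\end{align}

where $v = ds/dt$.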
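The helix results can be checked numerically from the time derivatives of r alone. The sketch below assumes the cross-product forms kappa = ||r' x r''|| / ||r'||^3 and, with this section's sign convention db/ds = tau n, tau = -(r' x r'') . r''' / ||r' x r''||^2. The function names and the sample values a = 5, b = 1, t = 0.7 are illustrative choices.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def helix_derivatives(t, a, b):
    """First three time derivatives of r(t) = a cos t i + a sin t j + b t k."""
    r1 = [-a * math.sin(t), a * math.cos(t), b]
    r2 = [-a * math.cos(t), -a * math.sin(t), 0.0]
    r3 = [a * math.sin(t), -a * math.cos(t), 0.0]
    return r1, r2, r3


def curvature_torsion(r1, r2, r3):
    """Curvature and torsion from time derivatives (sign convention db/ds = tau n)."""
    c = cross(r1, r2)
    v = math.sqrt(dot(r1, r1))            # speed ds/dt
    kappa = math.sqrt(dot(c, c)) / v ** 3
    tau = -dot(c, r3) / dot(c, c)
    return kappa, tau


a, b, t = 5.0, 1.0, 0.7
kappa, tau = curvature_torsion(*helix_derivatives(t, a, b))
print(kappa, a / (a * a + b * b))     # curvature vs. a / (a^2 + b^2)
print(tau, -b / (a * a + b * b))      # torsion vs. -b / (a^2 + b^2)
```

With the more common convention db/ds = -tau n, the torsion of this right-handed helix would come out positive; only the sign differs.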

6.5 Line and surface integrals 

If $\mathbf{r}$ is a position vector,

\[
\mathbf{r} = x_i \mathbf{e}_i, \tag{6.276}
\]

then $\phi(\mathbf{r})$ is a scalar field, and $\mathbf{u}(\mathbf{r})$ is a vector field.

6.5.1 Line integrals

A line integral is of the form

\[
I = \int_C \mathbf{u}^T \cdot d\mathbf{r}, \tag{6.277}
\]

where $\mathbf{u}$ is a vector field, and $d\mathbf{r}$ is an element of curve $C$. If $\mathbf{u} = u_i$, and $d\mathbf{r} = dx_i$, then we
can write

\[
I = \int_C u_i \, dx_i. \tag{6.278}
\]

Figure 6.5: Three-dimensional curve parameterized by $x(t) = a\cos t$, $y(t) = a\sin t$, $z(t) = bt$,
with $a = 5$, $b = 1$, for $t \in [0, 25]$.

Figure 6.6: The vector field $\mathbf{u} = yz\,\mathbf{i} + xy\,\mathbf{j} + xz\,\mathbf{k}$ and the curves a) $x = y^2 = z$; b) $x = y = z$.



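Example 6.12

Find

\[
I = \int_C \mathbf{u}^T \cdot d\mathbf{r}, \tag{6.279}
\]

if

\[
\mathbf{u} = yz \, \mathbf{i} + xy \, \mathbf{j} + xz \, \mathbf{k}, \tag{6.280}
\]

and $C$ goes from $(0, 0, 0)$ to $(1, 1, 1)$ along

(a) the curve $x = y^2 = z$,

(b) the straight line $x = y = z$.

The vector field and two paths are sketched in Fig. 6.6. We have

\[
\int_C \mathbf{u}^T \cdot d\mathbf{r} = \int_C \left( yz \, dx + xy \, dy + xz \, dz \right). \tag{6.281}
\]

(a) Substituting $x = y^2 = z$, and thus $dx = 2y \, dy$, $dx = dz$, we get

\begin{align}
I &= \int_0^1 y^3 (2y \, dy) + y^3 \, dy + y^4 (2y \, dy), \tag{6.282} \\
&= \int_0^1 \left( 2y^4 + y^3 + 2y^5 \right) dy, \tag{6.283} \\
&= \left. \frac{2y^5}{5} + \frac{y^4}{4} + \frac{y^6}{3} \right|_0^1, \tag{6.284} \\
&= \frac{59}{60}. \tag{6.285}
\end{align}

We can achieve the same result in an alternative way that is often more useful for curves
whose representation is more complicated. Let us parameterize $C$ by taking $x = t^2$, $y = t$, $z = t^2$, which satisfies $x = y^2 = z$. Thus
$dx = 2t \, dt$, $dy = dt$, $dz = 2t \, dt$. The end points of $C$ are at $t = 0$ and $t = 1$. So the integral is

\begin{align}
I &= \int_0^1 \left( t \, t^2 (2t) \, dt + t^2 t \, dt + t^2 t^2 (2t) \, dt \right), \tag{6.286} \\
&= \int_0^1 \left( 2t^4 + t^3 + 2t^5 \right) dt, \tag{6.287} \\
&= \left. \frac{2t^5}{5} + \frac{t^4}{4} + \frac{t^6}{3} \right|_0^1, \tag{6.288} \\
&= \frac{59}{60}. \tag{6.289}
\end{align}

(b) Substituting $x = y = z$, and thus $dx = dy = dz$, we get

\[
I = \int_0^1 \left( x^2 \, dx + x^2 \, dx + x^2 \, dx \right) = \int_0^1 3x^2 \, dx = \left. x^3 \right|_0^1 = 1. \tag{6.290}
\]

Note a different value for $I$ was obtained on path (b) relative to that found on path (a); thus, the
integral here is path-dependent.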
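The path dependence found in Example 6.12 can be reproduced by numerical quadrature along each parameterized path. In the sketch below, the helper names and the parameterizations (path (a) written with y = t, so x = z = t^2; path (b) with x = y = z = t) are assumptions chosen for illustration.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def u(x, y, z):
    """Vector field u = yz i + xy j + xz k."""
    return (y * z, x * y, x * z)


def line_integral(path, dpath, t0=0.0, t1=1.0):
    """Integrate u . dr along r = path(t), with dr/dt = dpath(t), for t in [t0, t1]."""
    def integrand(t):
        ux, uy, uz = u(*path(t))
        dx, dy, dz = dpath(t)
        return ux * dx + uy * dy + uz * dz
    return simpson(integrand, t0, t1)


# Path (a): x = y^2 = z, parameterized by y = t.
I_a = line_integral(lambda t: (t * t, t, t * t), lambda t: (2.0 * t, 1.0, 2.0 * t))

# Path (b): the straight line x = y = z = t.
I_b = line_integral(lambda t: (t, t, t), lambda t: (1.0, 1.0, 1.0))
print(I_a, 59.0 / 60.0, I_b)
```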



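In general the value of a line integral depends on the path. If, however, we have the
special case in which we can form $\mathbf{u} = \nabla\phi$ in Eq. (6.277), where $\phi$ is a scalar field, then

\begin{align}
I &= \int_C (\nabla\phi)^T \cdot d\mathbf{r}, \tag{6.291} \\
&= \int_C \frac{\partial\phi}{\partial x_i} \, dx_i, \tag{6.292} \\
&= \int_C d\phi, \tag{6.293} \\
&= \phi(\mathbf{b}) - \phi(\mathbf{a}), \tag{6.294}
\end{align}

where $\mathbf{a}$ and $\mathbf{b}$ are the beginning and end of curve $C$. The integral $I$ is then independent of
path. $\mathbf{u}$ is then called a conservative field, and $\phi$ is its potential.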

6.5.2 Surface integrals 

A surface integral is of the form 

\[
I = \int_S \mathbf{u}^T \cdot \mathbf{n} \, dS = \int_S u_i n_i \, dS, \tag{6.295}
\]

where $\mathbf{u}$ (or $u_i$) is a vector field, $S$ is an open or closed surface, $dS$ is an element of this
surface, and $\mathbf{n}$ (or $n_i$) is a unit vector normal to the surface element.

CC BY-NC-ND. 29 July 2012, Sen & Powers.



208 CHAPTER 6. VECTORS AND TENSORS 

6.6 Differential operators 

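Surface integrals can be used for coordinate-independent definitions of differential operators.
Beginning with some well-known theorems: the divergence theorem for a scalar, the divergence
theorem, and a little known theorem, which is possible to demonstrate, we have, where
$S$ is a surface enclosing volume $V$,

\begin{align}
\int_V \nabla\phi \, dV &= \int_S \mathbf{n}\phi \, dS, \tag{6.296} \\
\int_V \nabla^T \cdot \mathbf{u} \, dV &= \int_S \mathbf{n}^T \cdot \mathbf{u} \, dS, \tag{6.297} \\
\int_V (\nabla \times \mathbf{u}) \, dV &= \int_S \mathbf{n} \times \mathbf{u} \, dS. \tag{6.298}
\end{align}

Now we invoke the mean value theorem, which asserts that somewhere within the limits of
integration, the integrand takes on its mean value, which we denote with an overline, so
that, for example, $\int_V \alpha \, dV = \overline{\alpha} V$. Thus, we get

\begin{align}
\left( \overline{\nabla\phi} \right) V &= \int_S \mathbf{n}\phi \, dS, \tag{6.299} \\
\left( \overline{\nabla^T \cdot \mathbf{u}} \right) V &= \int_S \mathbf{n}^T \cdot \mathbf{u} \, dS, \tag{6.300} \\
\left( \overline{\nabla \times \mathbf{u}} \right) V &= \int_S \mathbf{n} \times \mathbf{u} \, dS. \tag{6.301}
\end{align}

As we let $V \to 0$, mean values approach local values, so we get

\begin{align}
\nabla\phi \equiv \mathrm{grad} \, \phi &= \lim_{V \to 0} \frac{1}{V}\int_S \mathbf{n}\phi \, dS, \tag{6.302} \\
\nabla^T \cdot \mathbf{u} \equiv \mathrm{div} \, \mathbf{u} &= \lim_{V \to 0} \frac{1}{V}\int_S \mathbf{n}^T \cdot \mathbf{u} \, dS, \tag{6.303} \\
\nabla \times \mathbf{u} \equiv \mathrm{curl} \, \mathbf{u} &= \lim_{V \to 0} \frac{1}{V}\int_S \mathbf{n} \times \mathbf{u} \, dS, \tag{6.304}
\end{align}

where $\phi(\mathbf{r})$ is a scalar field, and $\mathbf{u}(\mathbf{r})$ is a vector field. $V$ is the region enclosed within a
closed surface $S$, and $\mathbf{n}$ is the unit normal to an element of the surface $dS$. Here "grad" is
the gradient operator, "div" is the divergence operator, and "curl" is the curl operator.

Consider the element of volume in Cartesian coordinates shown in Fig. 6.7. The differential
operations in this coordinate system can be deduced from the definitions and written
in terms of the vector operator $\nabla$:

\[
\nabla = \mathbf{e}_1\frac{\partial}{\partial x_1} + \mathbf{e}_2\frac{\partial}{\partial x_2} + \mathbf{e}_3\frac{\partial}{\partial x_3} = \begin{pmatrix} \frac{\partial}{\partial x_1} \\ \frac{\partial}{\partial x_2} \\ \frac{\partial}{\partial x_3} \end{pmatrix} = \frac{\partial}{\partial x_i}. \tag{6.305}
\]

Figure 6.7: Element of volume.

We also adopt the unconventional, row vector operator

\[
\nabla^T = \begin{pmatrix} \frac{\partial}{\partial x_1} & \frac{\partial}{\partial x_2} & \frac{\partial}{\partial x_3} \end{pmatrix}. \tag{6.306}
\]

The operator $\nabla^T$ is well-defined for Cartesian coordinate systems, but does not extend to
non-orthogonal systems.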



6.6.1 Gradient of a scalar 

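Let's evaluate the gradient of a scalar function of a vector

\[
\mathrm{grad} \left( \phi(x_i) \right). \tag{6.307}
\]

We take the reference value of $\phi$ to be at the origin $O$. Consider first the $x_1$ variation. At
$O$, $x_1 = 0$, and our function takes the value of $\phi$. At the faces a distance $x_1 = \pm dx_1/2$ away
from $O$ in the $x_1$-direction, our function takes a value of

\[
\phi \pm \frac{\partial\phi}{\partial x_1}\frac{dx_1}{2}.
\]

Writing $V = dx_1 dx_2 dx_3$, Eq. (6.302) gives

\begin{align}
\mathrm{grad} \, \phi &= \lim_{V \to 0} \frac{1}{V}\left( \left( \phi + \frac{\partial\phi}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_1 dx_2 dx_3 \right. \tag{6.308} \\
&\quad \left. - \left( \phi - \frac{\partial\phi}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_1 dx_2 dx_3 + \text{similar terms from the } x_2 \text{ and } x_3 \text{ faces} \right), \tag{6.309} \\
&= \frac{\partial\phi}{\partial x_1}\mathbf{e}_1 + \frac{\partial\phi}{\partial x_2}\mathbf{e}_2 + \frac{\partial\phi}{\partial x_3}\mathbf{e}_3, \tag{6.310} \\
&= \frac{\partial\phi}{\partial x_i}\mathbf{e}_i = \frac{\partial\phi}{\partial x_i}, \tag{6.311} \\
&= \nabla\phi. \tag{6.312}
\end{align}

The derivative of $\phi$ on a particular path is called the directional derivative. If the path
has a unit tangent $\mathbf{t}$, the derivative in this direction is

\[
(\nabla\phi)^T \cdot \mathbf{t} = t_i\frac{\partial\phi}{\partial x_i}. \tag{6.313}
\]

If $\phi(x, y, z) = \text{constant}$ is a surface, then $d\phi = 0$ on this surface. Also

\begin{align}
d\phi &= \frac{\partial\phi}{\partial x_i} \, dx_i, \tag{6.314} \\
&= (\nabla\phi)^T \cdot d\mathbf{r}. \tag{6.315}
\end{align}

Since $d\mathbf{r}$ is tangent to the surface, $\nabla\phi$ must be normal to it. The tangent plane at $\mathbf{r} = \mathbf{r}_0$ is
defined by the position vector $\mathbf{r}$ such that

\[
(\nabla\phi)^T \cdot (\mathbf{r} - \mathbf{r}_0) = 0. \tag{6.316}
\]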



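Example 6.13

At the point $(1, 1, 1)$, find the unit normal to the surface

\[
z^3 + xz = x^2 + y^2. \tag{6.317}
\]

Define

\[
\phi(x, y, z) = z^3 + xz - x^2 - y^2 = 0. \tag{6.318}
\]

A normal at $(1, 1, 1)$ is

\begin{align}
\nabla\phi &= (z - 2x) \, \mathbf{i} - 2y \, \mathbf{j} + (3z^2 + x) \, \mathbf{k}, \tag{6.319} \\
&= -1 \, \mathbf{i} - 2 \, \mathbf{j} + 4 \, \mathbf{k}. \tag{6.320}
\end{align}

The unit normal is

\begin{align}
\mathbf{n} &= \frac{\nabla\phi}{||\nabla\phi||_2}, \tag{6.321} \\
&= \frac{1}{\sqrt{21}}\left( -1 \, \mathbf{i} - 2 \, \mathbf{j} + 4 \, \mathbf{k} \right). \tag{6.322}
\end{align}

Figure 6.8: Plot of surface $z^3 + xz = x^2 + y^2$ and normal vector at $(1, 1, 1)$.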
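The unit normal of Example 6.13 can be checked by approximating the gradient with central differences. This is a sketch only: the helper name `grad_fd` and the step size are assumptions made here.

```python
import math


def grad_fd(phi, point, h=1e-5):
    """Central-difference approximation of grad(phi) at point = (x, y, z)."""
    g = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        g.append((phi(*plus) - phi(*minus)) / (2.0 * h))
    return g


def phi(x, y, z):
    """Level-set function whose zero set is the surface z^3 + xz = x^2 + y^2."""
    return z ** 3 + x * z - x ** 2 - y ** 2


g = grad_fd(phi, (1.0, 1.0, 1.0))          # compare with (-1, -2, 4)
magnitude = math.sqrt(sum(c * c for c in g))
n = [c / magnitude for c in g]             # compare with (-1, -2, 4) / sqrt(21)
print(g, n)
```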

6.6.2 Divergence 
6.6.2.1 Vectors 



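Equation (6.303) becomes

\begin{align}
\mathrm{div} \, \mathbf{u} &= \lim_{V \to 0} \frac{1}{V}\left( \left( u_1 + \frac{\partial u_1}{\partial x_1}\frac{dx_1}{2} \right) dx_2 dx_3 - \left( u_1 - \frac{\partial u_1}{\partial x_1}\frac{dx_1}{2} \right) dx_2 dx_3 \right. \tag{6.323} \\
&\quad \left. + \text{similar terms from the } x_2 \text{ and } x_3 \text{ faces} \right), \nonumber \\
&= \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}, \tag{6.324} \\
&= \frac{\partial u_i}{\partial x_i}, \tag{6.325} \\
&= \nabla^T \cdot \mathbf{u} = \begin{pmatrix} \frac{\partial}{\partial x_1} & \frac{\partial}{\partial x_2} & \frac{\partial}{\partial x_3} \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}. \tag{6.326}
\end{align}

6.6.2.2 Tensors

The extension to tensors is straightforward

\begin{align}
\mathrm{div} \, \mathbf{T} &= \nabla^T \cdot \mathbf{T}, \tag{6.327} \\
&= \frac{\partial T_{ij}}{\partial x_i}. \tag{6.328}
\end{align}

Notice that this yields a vector quantity.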



6.6.3 Curl of a vector 



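The application of Eq. (6.304) is not obvious here. Consider just one of the faces: the face
whose outer normal is $\mathbf{e}_1$. For that face, one needs to evaluate

\[
\int_S \mathbf{n} \times \mathbf{u} \, dS. \tag{6.329}
\]

On this face, which lies a distance $dx_1/2$ from the center of the element, one has $\mathbf{n} = \mathbf{e}_1$, and

\[
\mathbf{u} = \left( u_1 + \frac{\partial u_1}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_1 + \left( u_2 + \frac{\partial u_2}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_2 + \left( u_3 + \frac{\partial u_3}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_3. \tag{6.330}
\]

So, on this face the integrand is

\begin{align}
\mathbf{n} \times \mathbf{u} &= \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ 1 & 0 & 0 \\ \left( u_1 + \frac{\partial u_1}{\partial x_1}\frac{dx_1}{2} \right) & \left( u_2 + \frac{\partial u_2}{\partial x_1}\frac{dx_1}{2} \right) & \left( u_3 + \frac{\partial u_3}{\partial x_1}\frac{dx_1}{2} \right) \end{vmatrix}, \tag{6.331} \\
&= \left( u_2 + \frac{\partial u_2}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_3 - \left( u_3 + \frac{\partial u_3}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_2. \tag{6.332}
\end{align}

Two similar terms appear on the opposite face, whose unit vector points in the $-\mathbf{e}_1$ direction.
Carrying out the integration then for Eq. (6.304), one gets

\begin{align}
\mathrm{curl} \, \mathbf{u} &= \lim_{V \to 0} \frac{1}{V}\left( \left( u_2 + \frac{\partial u_2}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_3 dx_2 dx_3 - \left( u_3 + \frac{\partial u_3}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_2 dx_2 dx_3 \right. \tag{6.333} \\
&\quad - \left( u_2 - \frac{\partial u_2}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_3 dx_2 dx_3 + \left( u_3 - \frac{\partial u_3}{\partial x_1}\frac{dx_1}{2} \right)\mathbf{e}_2 dx_2 dx_3 \nonumber \\
&\quad \left. + \text{similar terms from the } x_2 \text{ and } x_3 \text{ faces} \right), \tag{6.334} \\
&= \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \frac{\partial}{\partial x_1} & \frac{\partial}{\partial x_2} & \frac{\partial}{\partial x_3} \\ u_1 & u_2 & u_3 \end{vmatrix}, \tag{6.335} \\
&= \epsilon_{ijk}\frac{\partial u_k}{\partial x_j} = \nabla \times \mathbf{u}. \tag{6.336}
\end{align}

The curl of a tensor does not arise often in practice.

6.6.4 Laplacian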

6.6.4.1 Scalar 

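The Laplacian³ is simply div grad, and can be written, when operating on $\phi$, as

\[
\mathrm{div} \, \mathrm{grad} \, \phi = \nabla^T \cdot (\nabla\phi) = \nabla^2\phi = \frac{\partial^2\phi}{\partial x_i \partial x_i}. \tag{6.337}
\]

6.6.4.2 Vector

Equation (6.346) is used to evaluate the Laplacian of a vector:

\[
\nabla^2\mathbf{u} = \nabla^T \cdot \nabla\mathbf{u} = \nabla(\nabla^T \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u}). \tag{6.338}
\]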



6.6.5 Identities 



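\begin{align}
\nabla \times (\nabla\phi) &= \mathbf{0}, \tag{6.339} \\
\nabla^T \cdot (\nabla \times \mathbf{u}) &= 0, \tag{6.340} \\
\nabla^T \cdot (\phi\mathbf{u}) &= \phi\nabla^T \cdot \mathbf{u} + (\nabla\phi)^T \cdot \mathbf{u}, \tag{6.341} \\
\nabla \times (\phi\mathbf{u}) &= \phi\nabla \times \mathbf{u} + \nabla\phi \times \mathbf{u}, \tag{6.342} \\
\nabla^T \cdot (\mathbf{u} \times \mathbf{v}) &= \mathbf{v}^T \cdot (\nabla \times \mathbf{u}) - \mathbf{u}^T \cdot (\nabla \times \mathbf{v}), \tag{6.343} \\
\nabla \times (\mathbf{u} \times \mathbf{v}) &= (\mathbf{v}^T \cdot \nabla)\mathbf{u} - (\mathbf{u}^T \cdot \nabla)\mathbf{v} + \mathbf{u}(\nabla^T \cdot \mathbf{v}) - \mathbf{v}(\nabla^T \cdot \mathbf{u}), \tag{6.344} \\
\nabla(\mathbf{u}^T \cdot \mathbf{v}) &= (\mathbf{u}^T \cdot \nabla)\mathbf{v} + (\mathbf{v}^T \cdot \nabla)\mathbf{u} + \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}), \tag{6.345} \\
\nabla \cdot \nabla^T\mathbf{u} &= \nabla(\nabla^T \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u}). \tag{6.346}
\end{align}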

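Example 6.14

Show that Eq. (6.346)

\[
\nabla \cdot \nabla^T\mathbf{u} = \nabla(\nabla^T \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u}) \tag{6.347}
\]

is true.

Going from right to left

\begin{align}
\nabla(\nabla^T \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u}) &= \frac{\partial}{\partial x_i}\frac{\partial u_j}{\partial x_j} - \epsilon_{ijk}\frac{\partial}{\partial x_j}\left( \epsilon_{klm}\frac{\partial u_m}{\partial x_l} \right), \tag{6.348} \\
&= \frac{\partial}{\partial x_i}\frac{\partial u_j}{\partial x_j} - \epsilon_{kij}\epsilon_{klm}\frac{\partial}{\partial x_j}\left( \frac{\partial u_m}{\partial x_l} \right), \tag{6.349} \\
&= \frac{\partial^2 u_j}{\partial x_i \partial x_j} - \left( \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} \right)\frac{\partial^2 u_m}{\partial x_j \partial x_l}, \tag{6.350} \\
&= \frac{\partial^2 u_j}{\partial x_i \partial x_j} - \frac{\partial^2 u_j}{\partial x_j \partial x_i} + \frac{\partial^2 u_i}{\partial x_j \partial x_j}, \tag{6.351} \\
&= \frac{\partial}{\partial x_j}\left( \frac{\partial u_i}{\partial x_j} \right), \tag{6.352} \\
&= \nabla^T \cdot \nabla\mathbf{u}. \tag{6.353}
\end{align}

³ Pierre-Simon Laplace, 1749-1827, Normandy-born French mathematician.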



6.6.6 Curvature revisited 

If a curve in two-dimensional space is given implicitly by the function

\[
\phi(x, y) = 0, \tag{6.354}
\]

it can be shown that the curvature is given by the formula

\[
\kappa = \nabla \cdot \left( \frac{\nabla\phi}{||\nabla\phi||_2} \right), \tag{6.355}
\]

provided one takes precautions to preserve the sign as will be demonstrated in the following
example. Note that $\nabla\phi$ is a gradient vector which must be normal to any so-called level set
curve for which $\phi$ is constant; moreover, it points in the direction of most rapid change of $\phi$.
The corresponding vector $\nabla\phi/||\nabla\phi||_2$ must be a unit normal vector to level sets of $\phi$.



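Example 6.15

Show Eq. (6.355) is equivalent to Eq. (6.227) if $y = f(x)$.

Let us take

\[
\phi(x, y) = f(x) - y = 0. \tag{6.356}
\]

Then, with $'$ denoting a derivative with respect to $x$, we get

\begin{align}
\nabla\phi &= \frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j}, \tag{6.357} \\
&= f'(x) \, \mathbf{i} - \mathbf{j}. \tag{6.358}
\end{align}

We then see that

\[
||\nabla\phi||_2 = \sqrt{f'(x)^2 + 1}, \tag{6.359}
\]

so that

\[
\frac{\nabla\phi}{||\nabla\phi||_2} = \frac{f'(x) \, \mathbf{i} - \mathbf{j}}{\sqrt{1 + f'(x)^2}}. \tag{6.360}
\]

Then we see that by applying Eq. (6.355), we get

\begin{align}
\kappa &= \nabla \cdot \left( \frac{\nabla\phi}{||\nabla\phi||_2} \right), \tag{6.361} \\
&= \nabla \cdot \left( \frac{f'(x) \, \mathbf{i} - \mathbf{j}}{\sqrt{1 + f'(x)^2}} \right), \tag{6.362} \\
&= \frac{\partial}{\partial x}\left( \frac{f'(x)}{\sqrt{1 + f'(x)^2}} \right) + \underbrace{\frac{\partial}{\partial y}\left( \frac{-1}{\sqrt{1 + f'(x)^2}} \right)}_{=0}, \tag{6.363} \\
&= \frac{\sqrt{1 + f'(x)^2} \, f''(x) - f'(x)f'(x)f''(x)\left( 1 + f'(x)^2 \right)^{-1/2}}{1 + f'(x)^2}, \tag{6.364} \\
&= \frac{\left( 1 + f'(x)^2 \right)f''(x) - f'(x)f'(x)f''(x)}{\left( 1 + f'(x)^2 \right)^{3/2}}, \tag{6.365} \\
&= \frac{f''(x)}{\left( 1 + f'(x)^2 \right)^{3/2}}. \tag{6.366}
\end{align}

Equation (6.366) is fully equivalent to the earlier developed Eq. (6.227). Note however that if we had
chosen $\phi(x, y) = y - f(x) = 0$, we would have recovered a formula for curvature with the opposite sign.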
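The level-set form of the curvature, the divergence of the unit normal, can be evaluated numerically with nested central differences. The following is a sketch under assumptions: the function names, the step size, and the two test functions (a circle of radius 2 and the graph y = x^2 written as phi = x^2 - y) are illustrative choices.

```python
import math


def unit_normal(phi, x, y, h):
    """grad(phi) / ||grad(phi)||_2 at (x, y), by central differences."""
    px = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    py = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    mag = math.hypot(px, py)
    return px / mag, py / mag


def level_set_curvature(phi, x, y, h=1e-3):
    """Divergence of the unit normal field of phi, by nested central differences."""
    nx_plus, _ = unit_normal(phi, x + h, y, h)
    nx_minus, _ = unit_normal(phi, x - h, y, h)
    _, ny_plus = unit_normal(phi, x, y + h, h)
    _, ny_minus = unit_normal(phi, x, y - h, h)
    return (nx_plus - nx_minus) / (2.0 * h) + (ny_plus - ny_minus) / (2.0 * h)


# Circle of radius 2 written as phi = x^2 + y^2 - 4: curvature magnitude 1/2.
kappa_circle = level_set_curvature(lambda x, y: x * x + y * y - 4.0, 0.0, 2.0)

# Graph y = f(x) = x^2 written as phi = f(x) - y: f''/(1 + f'^2)^(3/2) = 2 at x = 0.
kappa_graph = level_set_curvature(lambda x, y: x * x - y, 0.0, 0.0)
print(kappa_circle, kappa_graph)
```

Swapping the sign of phi flips the sign of the result, consistent with the remark on sign at the end of the example.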



Consider now surfaces embedded in a three-dimensional space, described implicitly
by

\[
\phi(x, y, z) = 0. \tag{6.367}
\]

It can be shown that the so-called mean curvature of the surface, $\kappa_M$, is given by Eq. (6.355):

\[
\kappa_M = \nabla \cdot \left( \frac{\nabla\phi}{||\nabla\phi||_2} \right). \tag{6.368}
\]

Note that there are many other measures of curvature of surfaces.

Lastly, let us return to consider one-dimensional curves embedded within a high-dimensional
space. The curves may be considered to be defined as solutions to the differential
equations of the form

\[
\frac{d\mathbf{x}}{dt} = \mathbf{v}(\mathbf{x}). \tag{6.369}
\]

We can consider $\mathbf{v}(\mathbf{x})$ to be a velocity field which is dependent on position $\mathbf{x}$, but independent
of time. A particle with a known initial condition will move through the field, acquiring a
new velocity at each new spatial point it encounters, and thus tracing a non-trivial trajectory.
We now take the velocity gradient tensor to be $\mathbf{F}$, with

\[
\mathbf{F} = \nabla\mathbf{v}^T. \tag{6.370}
\]

With this, it can then be shown after detailed analysis that the curvature of the trajectory
is given by

\[
\kappa = \frac{\sqrt{\left( \mathbf{v}^T \cdot \mathbf{F} \cdot \mathbf{F}^T \cdot \mathbf{v} \right)\left( \mathbf{v}^T \cdot \mathbf{v} \right) - \left( \mathbf{v}^T \cdot \mathbf{F}^T \cdot \mathbf{v} \right)^2}}{\left( \mathbf{v}^T \cdot \mathbf{v} \right)^{3/2}}. \tag{6.371}
\]

In terms of the unit tangent vector, $\mathbf{t} = \mathbf{v}/||\mathbf{v}||_2$, Eq. (6.371) reduces to

\[
\kappa = \frac{\sqrt{\left( \mathbf{t}^T \cdot \mathbf{F} \cdot \mathbf{F}^T \cdot \mathbf{t} \right) - \left( \mathbf{t}^T \cdot \mathbf{F}^T \cdot \mathbf{t} \right)^2}}{||\mathbf{v}||_2}. \tag{6.372}
\]



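Example 6.16

Find the curvature of the curve given by

\begin{align}
\frac{dx}{dt} &= -y, \qquad x(0) = 0, \tag{6.373} \\
\frac{dy}{dt} &= x, \qquad y(0) = 2. \tag{6.374}
\end{align}

We can of course solve this exactly by first dividing one equation by the other to get

\[
\frac{dy}{dx} = -\frac{x}{y}, \qquad y(x = 0) = 2. \tag{6.375}
\]

Separating variables, we get

\begin{align}
y \, dy &= -x \, dx, \tag{6.376} \\
\frac{y^2}{2} &= -\frac{x^2}{2} + C, \tag{6.377} \\
\frac{2^2}{2} &= -\frac{0^2}{2} + C, \tag{6.378} \\
C &= 2. \tag{6.379}
\end{align}

Thus,

\[
x^2 + y^2 = 4, \tag{6.380}
\]

is the curve of interest. It is a circle whose radius is 2 and thus whose radius of curvature $\rho = 2$; thus,
its curvature $\kappa = 1/\rho = 1/2$.

Let us reproduce this result using Eq. (6.371). We can think of the two-dimensional velocity vector
as

\[
\mathbf{v} = \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix}. \tag{6.381}
\]

The velocity gradient is then

\[
\mathbf{F} = \nabla\mathbf{v}^T = \begin{pmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \end{pmatrix}\begin{pmatrix} u(x, y) & v(x, y) \end{pmatrix} = \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial v}{\partial x} \\ \frac{\partial u}{\partial y} & \frac{\partial v}{\partial y} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{6.382}
\]

Now, let us use Eq. (6.371) to directly compute the curvature. The simple nature of our velocity field
induces several simplifications. First, because the velocity gradient tensor here is antisymmetric, we
have

\[
\mathbf{v}^T \cdot \mathbf{F}^T \cdot \mathbf{v} = \begin{pmatrix} -y & x \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} -y \\ x \end{pmatrix} = \begin{pmatrix} -y & x \end{pmatrix}\begin{pmatrix} -x \\ -y \end{pmatrix} = xy - xy = 0. \tag{6.383}
\]

Second, we see that

\[
\mathbf{F} \cdot \mathbf{F}^T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathbf{I}. \tag{6.384}
\]

So for this problem, Eq. (6.371) reduces to

\begin{align}
\kappa &= \frac{\sqrt{\left( \mathbf{v}^T \cdot \overbrace{\mathbf{F} \cdot \mathbf{F}^T}^{\mathbf{I}} \cdot \mathbf{v} \right)\left( \mathbf{v}^T \cdot \mathbf{v} \right) - \overbrace{\left( \mathbf{v}^T \cdot \mathbf{F}^T \cdot \mathbf{v} \right)^2}^{=0}}}{\left( \mathbf{v}^T \cdot \mathbf{v} \right)^{3/2}}, \tag{6.385} \\
&= \frac{\sqrt{\left( \mathbf{v}^T \cdot \mathbf{v} \right)\left( \mathbf{v}^T \cdot \mathbf{v} \right)}}{\left( \mathbf{v}^T \cdot \mathbf{v} \right)^{3/2}}, \tag{6.386} \\
&= \frac{\mathbf{v}^T \cdot \mathbf{v}}{\left( \mathbf{v}^T \cdot \mathbf{v} \right)^{3/2}}, \tag{6.387} \\
&= \frac{1}{\sqrt{\mathbf{v}^T \cdot \mathbf{v}}}, \tag{6.388} \\
&= \frac{1}{||\mathbf{v}||_2}, \tag{6.389} \\
&= \frac{1}{\sqrt{x^2 + y^2}}, \tag{6.390} \\
&= \frac{1}{\sqrt{4}}, \tag{6.391} \\
&= \frac{1}{2}. \tag{6.392}
\end{align}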
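The trajectory-curvature formula used in Example 6.16 can be evaluated directly from a velocity field. The sketch below is written under stated assumptions: the function names are illustrative, the velocity gradient F (with entries F_ij = d v_j / d x_i) is approximated by central differences, and the evaluation point is the initial condition (0, 2) of the example.

```python
import math


def velocity(x, y):
    """Velocity field of Example 6.16: dx/dt = -y, dy/dt = x."""
    return (-y, x)


def velocity_gradient(v, x, y, h=1e-5):
    """F = grad v^T, entries F[i][j] = d v_j / d x_i, by central differences."""
    vxp, vxm = v(x + h, y), v(x - h, y)
    vyp, vym = v(x, y + h), v(x, y - h)
    return [[(vxp[0] - vxm[0]) / (2.0 * h), (vxp[1] - vxm[1]) / (2.0 * h)],
            [(vyp[0] - vym[0]) / (2.0 * h), (vyp[1] - vym[1]) / (2.0 * h)]]


def trajectory_curvature(v, x, y):
    """sqrt((v.F.F^T.v)(v.v) - (v.F^T.v)^2) / (v.v)^(3/2) at the point (x, y)."""
    vel = v(x, y)
    F = velocity_gradient(v, x, y)
    # (F^T . v)_i = F_ji v_j
    FTv = [sum(F[j][i] * vel[j] for j in range(2)) for i in range(2)]
    vv = sum(c * c for c in vel)
    vFFTv = sum(c * c for c in FTv)            # v.F.F^T.v = ||F^T v||^2
    vFTv = sum(a * b for a, b in zip(vel, FTv))
    radicand = max(vFFTv * vv - vFTv ** 2, 0.0)  # guard against round-off
    return math.sqrt(radicand) / vv ** 1.5


kappa = trajectory_curvature(velocity, 0.0, 2.0)   # compare with 1/2
print(kappa)
```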



6.7 Special theorems 

6.7.1 Green's theorem 

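Let $\mathbf{u} = u_x \, \mathbf{i} + u_y \, \mathbf{j}$ be a vector field, $C$ a closed curve, and $D$ the region enclosed by $C$, all
in the $x$-$y$ plane. Then

\[
\oint_C \mathbf{u}^T \cdot d\mathbf{r} = \iint_D \left( \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y} \right) dx \, dy. \tag{6.393}
\]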



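Example 6.17

Show that Green's theorem is valid if $\mathbf{u} = y \, \mathbf{i} + 2xy \, \mathbf{j}$, and $C$ consists of the straight lines $(0,0)$ to
$(1,0)$ to $(1,1)$ to $(0,0)$.

\[
\oint_C \mathbf{u}^T \cdot d\mathbf{r} = \int_{C_1} \mathbf{u}^T \cdot d\mathbf{r} + \int_{C_2} \mathbf{u}^T \cdot d\mathbf{r} + \int_{C_3} \mathbf{u}^T \cdot d\mathbf{r}, \tag{6.394}
\]

where $C_1$, $C_2$, and $C_3$ are the straight lines $(0,0)$ to $(1,0)$, $(1,0)$ to $(1,1)$, and $(1,1)$ to $(0,0)$, respectively.
This is sketched in Figure 6.9.

Figure 6.9: Sketch of vector field $\mathbf{u} = y\,\mathbf{i} + 2xy\,\mathbf{j}$ and closed contour integral $C$.

For this problem we have

\begin{align}
C_1 &: \quad y = 0, \quad dy = 0, \quad x \in [0, 1], \quad \mathbf{u} = 0 \, \mathbf{i} + 0 \, \mathbf{j}, \tag{6.395} \\
C_2 &: \quad x = 1, \quad dx = 0, \quad y \in [0, 1], \quad \mathbf{u} = y \, \mathbf{i} + 2y \, \mathbf{j}, \tag{6.396} \\
C_3 &: \quad x = y, \quad dx = dy, \quad x \in [1, 0], \quad y \in [1, 0], \quad \mathbf{u} = x \, \mathbf{i} + 2x^2 \, \mathbf{j}. \tag{6.397}
\end{align}

Thus,

\begin{align}
\oint_C \mathbf{u}^T \cdot d\mathbf{r} &= \underbrace{\int_0^1 (0 \, \mathbf{i} + 0 \, \mathbf{j}) \cdot (dx \, \mathbf{i})}_{C_1} + \underbrace{\int_0^1 (y \, \mathbf{i} + 2y \, \mathbf{j}) \cdot (dy \, \mathbf{j})}_{C_2} + \underbrace{\int_1^0 (x \, \mathbf{i} + 2x^2 \, \mathbf{j}) \cdot (dx \, \mathbf{i} + dx \, \mathbf{j})}_{C_3}, \tag{6.398} \\
&= \int_0^1 2y \, dy + \int_1^0 \left( x + 2x^2 \right) dx, \tag{6.399} \\
&= \left. y^2 \right|_0^1 + \left. \left( \frac{1}{2}x^2 + \frac{2}{3}x^3 \right) \right|_1^0 = 1 - \frac{1}{2} - \frac{2}{3}, \tag{6.400} \\
&= -\frac{1}{6}. \tag{6.401}
\end{align}

On the other hand,

\begin{align}
\iint_D \left( \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y} \right) dx \, dy &= \int_0^1 \int_0^x (2y - 1) \, dy \, dx, \tag{6.402} \\
&= \int_0^1 \left( \left. \left( y^2 - y \right) \right|_0^x \right) dx, \tag{6.403} \\
&= \int_0^1 \left( x^2 - x \right) dx, \tag{6.404} \\
&= \left. \left( \frac{x^3}{3} - \frac{x^2}{2} \right) \right|_0^1, \tag{6.405} \\
&= \frac{1}{3} - \frac{1}{2}, \tag{6.406} \\
&= -\frac{1}{6}. \tag{6.407}
\end{align}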
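Both sides of Green's theorem for Example 6.17 can be evaluated numerically as a cross-check. In this sketch the helper names are assumptions; each straight segment is parameterized by s in [0, 1], and the area integral is written as an iterated integral over the triangle 0 <= y <= x <= 1.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] using an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def u(x, y):
    """Vector field u = y i + 2xy j."""
    return (y, 2.0 * x * y)


def segment_integral(p0, p1):
    """Integrate u . dr along the straight segment from p0 to p1."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]

    def integrand(s):
        ux, uy = u(p0[0] + s * dx, p0[1] + s * dy)
        return ux * dx + uy * dy

    return simpson(integrand, 0.0, 1.0)


# Closed contour (0,0) -> (1,0) -> (1,1) -> (0,0).
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
circulation = sum(segment_integral(vertices[k], vertices[k + 1]) for k in range(3))


def inner(x):
    """Integral over y in [0, x] of (d u_y/dx - d u_x/dy) = 2y - 1."""
    return simpson(lambda y: 2.0 * y - 1.0, 0.0, x)


area_integral = simpson(inner, 0.0, 1.0)
print(circulation, area_integral)   # both should be close to -1/6
```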



6.7.2 Divergence theorem 

Let us consider Eq. (6.300) in more detail. Let $S$ be a closed surface, and $V$ the region
enclosed within it, then the divergence theorem is

\begin{align}
\int_S \mathbf{u}^T \cdot \mathbf{n} \, dS &= \int_V \nabla^T \cdot \mathbf{u} \, dV, \tag{6.408} \\
\int_S u_i n_i \, dS &= \int_V \frac{\partial u_i}{\partial x_i} \, dV, \tag{6.409}
\end{align}

where $dV$ is an element of volume, $dS$ is an element of the surface, and $\mathbf{n}$ (or $n_i$) is the outward
unit normal to it. The divergence theorem is also known as Gauss's theorem. It extends to
tensors of arbitrary order:

\[
\int_S T_{ijk\ldots} n_i \, dS = \int_V \frac{\partial T_{ijk\ldots}}{\partial x_i} \, dV. \tag{6.410}
\]

Note if $T_{ijk\ldots} = C$, then we get

\[
\int_S n_i \, dS = 0. \tag{6.411}
\]

The divergence theorem can be thought of as an extension of the familiar one-dimensional
scalar result:

\[
\phi(b) - \phi(a) = \int_a^b \frac{d\phi}{dx} \, dx. \tag{6.412}
\]

Here the end points play the role of the surface integral, and the integral on $x$ plays the role
of the volume integral.



I 

Example 6.18 

Show that the divergence theorem is valid if 

u = x i + y j + Ok, (6.413) 

and S is the closed surface which consists of a circular base and the hemisphere of unit radius with 
center at the origin and z > 0, that is, 

x 2 +y 2 + z 2 = 1. (6.414) 
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z 0.0 - 



Figure 6.10: Sketch depicting x + y + z = 1, z > and vector field u = x'\ + yj + Ok. 



In spherical coordinates, defined by 



x 

y 

z 



the hemispherical surface is described by 



r sin 9 cos 0, 
r sin (9 sin</>, 
r cos 8, 

r = 1. 



A sketch of the surface of interest along with the vector field is shown in Figure 16.101 
We split the surface integral into two parts 



J u T -ndS = J u T • n dS + f u T ■ n dS, 

Js Jb Jh 



(6.415) 
(6.416) 
(6.417) 

(6.418) 



(6.419) 



where B is the base and H the curved surface of the hemisphere. 

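The first term on the right is zero since $\mathbf{n} = -\mathbf{k}$, and $\mathbf{u}^T \cdot \mathbf{n} = 0$ on $B$. In general, the unit normal
pointing in the $r$ direction can be shown to be

$\mathbf{e}_r = \mathbf{n} = \sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}.$ (6.420)

This is in fact the unit normal on $H$. Thus, on $H$, where $r = 1$, we have

$\mathbf{u}^T \cdot \mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j} + 0\,\mathbf{k}) \cdot (\sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}),$ (6.421)
$= (r\sin\theta\cos\phi\,\mathbf{i} + r\sin\theta\sin\phi\,\mathbf{j} + 0\,\mathbf{k}) \cdot (\sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}),$ (6.422)
$= r\sin^2\theta\cos^2\phi + r\sin^2\theta\sin^2\phi, \quad \text{with } r = 1,$ (6.423)
$= \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi,$ (6.424)
$= \sin^2\theta,$ (6.425)

$\int_H \mathbf{u}^T \cdot \mathbf{n}\, dS = \int_0^{2\pi}\int_0^{\pi/2} \sin^2\theta\, (\sin\theta\, d\theta\, d\phi),$ (6.426)
$= \int_0^{2\pi}\int_0^{\pi/2} \sin^3\theta\, d\theta\, d\phi,$ (6.427)
$= \int_0^{2\pi}\int_0^{\pi/2} \left(\frac{3}{4}\sin\theta - \frac{1}{4}\sin 3\theta\right) d\theta\, d\phi,$ (6.428)
$= 2\pi \int_0^{\pi/2} \left(\frac{3}{4}\sin\theta - \frac{1}{4}\sin 3\theta\right) d\theta,$ (6.429)
$= 2\pi \left(\frac{3}{4} - \frac{1}{12}\right),$ (6.430)
$= \frac{4}{3}\pi.$ (6.431)

On the other hand, if we use the divergence theorem, we find that

$\nabla^T \cdot \mathbf{u} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(0) = 2,$ (6.432)

so that

$\int_V \nabla^T \cdot \mathbf{u}\, dV = 2\int_V dV = 2\,\frac{2}{3}\pi = \frac{4}{3}\pi,$ (6.433)

since the volume of the hemisphere is $(2/3)\pi$.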

I 
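The result of Example 6.18 can be checked numerically. The sketch below is only an illustration: the function name `surface_flux_hemisphere` and the grid sizes are assumptions, and a simple midpoint rule stands in for the analytical integration.

```python
import math

def surface_flux_hemisphere(n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the flux of u = x i + y j + 0 k through
    the curved surface H of the unit hemisphere (r = 1, 0 <= theta <= pi/2)."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # On the unit sphere the outward normal equals the position vector.
            x = math.sin(theta) * math.cos(phi)
            y = math.sin(theta) * math.sin(phi)
            u_dot_n = x * x + y * y  # the k component of u is zero
            total += u_dot_n * math.sin(theta) * d_theta * d_phi
    return total

flux = surface_flux_hemisphere()
# Divergence of u is 2; the hemisphere volume is (2/3) pi.
volume_integral = 2.0 * (2.0 / 3.0) * math.pi
print(flux, volume_integral)
```

The base contributes nothing to the flux, so `flux` alone should agree with `volume_integral` to within the quadrature error.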



6.7.3 Green's identities 

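Applying the divergence theorem, Eq. (6.409), to the vector $\mathbf{u} = \phi\nabla\psi$, we get

$\int_S \phi(\nabla\psi)^T \cdot \mathbf{n}\, dS = \int_V \nabla^T \cdot (\phi\nabla\psi)\, dV,$ (6.434)

$\int_S \phi \frac{\partial\psi}{\partial x_i} n_i\, dS = \int_V \frac{\partial}{\partial x_i}\left(\phi\frac{\partial\psi}{\partial x_i}\right) dV.$ (6.435)

From this, we get Green's first identity

$\int_S \phi(\nabla\psi)^T \cdot \mathbf{n}\, dS = \int_V \left(\phi\nabla^2\psi + (\nabla\phi)^T \cdot \nabla\psi\right) dV,$ (6.436)

$\int_S \phi \frac{\partial\psi}{\partial x_i} n_i\, dS = \int_V \left(\phi\frac{\partial^2\psi}{\partial x_i \partial x_i} + \frac{\partial\phi}{\partial x_i}\frac{\partial\psi}{\partial x_i}\right) dV.$ (6.437)

Interchanging $\phi$ and $\psi$ in Eq. (6.436), we get

$\int_S \psi(\nabla\phi)^T \cdot \mathbf{n}\, dS = \int_V \left(\psi\nabla^2\phi + (\nabla\psi)^T \cdot \nabla\phi\right) dV,$ (6.438)

$\int_S \psi \frac{\partial\phi}{\partial x_i} n_i\, dS = \int_V \left(\psi\frac{\partial^2\phi}{\partial x_i \partial x_i} + \frac{\partial\psi}{\partial x_i}\frac{\partial\phi}{\partial x_i}\right) dV.$ (6.439)

Subtracting Eq. (6.438) from Eq. (6.436), we get Green's second identity

$\int_S (\phi\nabla\psi - \psi\nabla\phi)^T \cdot \mathbf{n}\, dS = \int_V \left(\phi\nabla^2\psi - \psi\nabla^2\phi\right) dV,$ (6.440)

$\int_S \left(\phi\frac{\partial\psi}{\partial x_i} - \psi\frac{\partial\phi}{\partial x_i}\right) n_i\, dS = \int_V \left(\phi\frac{\partial^2\psi}{\partial x_i \partial x_i} - \psi\frac{\partial^2\phi}{\partial x_i \partial x_i}\right) dV.$ (6.441)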

6.7.4 Stokes' theorem 

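Consider Stokes' theorem. Let $S$ be an open surface, and the curve $C$ its boundary. Then

$\int_S (\nabla \times \mathbf{u})^T \cdot \mathbf{n}\, dS = \oint_C \mathbf{u}^T \cdot d\mathbf{r},$ (6.442)

$\int_S \epsilon_{ijk} \frac{\partial u_k}{\partial x_j} n_i\, dS = \oint_C u_i\, dr_i,$ (6.443)

where $\mathbf{n}$ is the unit vector normal to the element $dS$, and $d\mathbf{r}$ an element of curve $C$.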



I 

Example 6.19 

Evaluate

$I = \int_S (\nabla \times \mathbf{u})^T \cdot \mathbf{n}\, dS,$ (6.444)

using Stokes's theorem, where

$\mathbf{u} = x^3\,\mathbf{j} - (z + 1)\,\mathbf{k},$ (6.445)

and $S$ is the surface $z = 4 - 4x^2 - y^2$ for $z \ge 0$.

Using Stokes's theorem, the surface integral can be converted to a line integral along the boundary
$C$ which is the curve $4 - 4x^2 - y^2 = 0$.

$I = \oint_C \mathbf{u}^T \cdot d\mathbf{r},$ (6.446)
$= \oint_C \left(x^3\,\mathbf{j} - (z + 1)\,\mathbf{k}\right) \cdot \left(dx\,\mathbf{i} + dy\,\mathbf{j}\right),$ (6.447)
$= \oint_C x^3\, dy.$ (6.448)

$C$ can be represented by the parametric equations $x = \cos t$, $y = 2\sin t$. This is easily seen by direct
substitution on $C$:

$4 - 4x^2 - y^2 = 4 - 4\cos^2 t - (2\sin t)^2 = 4 - 4(\cos^2 t + \sin^2 t) = 4 - 4 = 0.$ (6.449)

Thus, $dy = 2\cos t\, dt$, so that

$I = \int_0^{2\pi} \cos^3 t\, (2\cos t\, dt),$ (6.450)



George Gabriel Stokes, 1819-1903, Irish-born English mathematician.

Figure 6.11: Sketch depicting $z = 4 - 4x^2 - y^2$ and vector field $\mathbf{u} = x^3\,\mathbf{j} - (z + 1)\,\mathbf{k}$.



$I = 2\int_0^{2\pi} \cos^4 t\, dt,$ (6.451)
$= 2\int_0^{2\pi} \left(\frac{1}{8}\cos 4t + \frac{1}{2}\cos 2t + \frac{3}{8}\right) dt,$ (6.452)
$= 2\left[\frac{1}{32}\sin 4t + \frac{1}{4}\sin 2t + \frac{3}{8}t\right]_0^{2\pi},$ (6.453)
$= \frac{3\pi}{2}.$ (6.454)

A sketch of the surface of interest along with the vector field is shown in Figure 6.11. The curve $C$ is
on the boundary $z = 0$.

I 
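Both sides of Stokes' theorem for Example 6.19 can be estimated numerically. This is a sketch under stated assumptions: the function names and grid sizes are illustrative, the curl $\nabla \times \mathbf{u} = 3x^2\,\mathbf{k}$ was worked out by hand, and the surface integral is projected onto the ellipse $4x^2 + y^2 \le 4$ using $x = \rho\cos t$, $y = 2\rho\sin t$ (area element $2\rho\, d\rho\, dt$).

```python
import math

def line_integral(n=2000):
    """Midpoint estimate of the closed integral of x^3 dy on x = cos t, y = 2 sin t."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += math.cos(t) ** 3 * (2 * math.cos(t)) * dt
    return total

def surface_integral(n_rho=200, n_t=200):
    """Midpoint estimate of the integral of (curl u) . n dS = 3 x^2 dx dy
    over the ellipse 4 x^2 + y^2 <= 4."""
    d_rho = 1.0 / n_rho
    d_t = 2 * math.pi / n_t
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_t):
            t = (j + 0.5) * d_t
            x = rho * math.cos(t)
            total += 3 * x * x * (2 * rho) * d_rho * d_t
    return total

I_line = line_integral()
I_surface = surface_integral()
print(I_line, I_surface, 1.5 * math.pi)
```

Both estimates should be close to $3\pi/2$, the value found in Eq. (6.454).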



6.7.5 Leibniz's rule 

If we consider an arbitrary moving volume $V(t)$ with a corresponding surface area $S(t)$ with
surface volume elements moving at velocity $w_k$, Leibniz's rule, extended from the earlier
Eq. (1.293), gives us a means to calculate the time derivatives of integrated quantities. For
an arbitrary order tensor, it is

$\frac{d}{dt}\int_{V(t)} T_{jk\ldots}(x_i,t)\, dV = \int_{V(t)} \frac{\partial T_{jk\ldots}(x_i,t)}{\partial t}\, dV + \int_{S(t)} n_m w_m T_{jk\ldots}(x_i,t)\, dS.$ (6.455)

Note if $T_{jk\ldots}(x_i,t) = 1$, we get

$\frac{d}{dt}\int_{V(t)} (1)\, dV = \int_{V(t)} \frac{\partial}{\partial t}(1)\, dV + \int_{S(t)} n_m w_m (1)\, dS,$ (6.456)

$\frac{dV}{dt} = \int_{S(t)} n_m w_m\, dS.$ (6.457)

Here the volume changes due to the net surface motion. In one dimension, with $T_{jk\ldots}(x_i,t) = f(x,t)$,
we get

$\frac{d}{dt}\int_{x=a(t)}^{x=b(t)} f(x,t)\, dx = \int_{x=a(t)}^{x=b(t)} \frac{\partial f}{\partial t}\, dx + \frac{db}{dt} f(b(t),t) - \frac{da}{dt} f(a(t),t).$ (6.458)
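The one-dimensional form of Leibniz's rule, Eq. (6.458), can be spot-checked numerically. The following is a minimal sketch: the test function $f(x,t) = \sin(xt)$, the limits $a(t) = t$ and $b(t) = 1 + t^2$, and the helper `simpson` are all assumptions chosen for illustration.

```python
import math

def simpson(g, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3

def f(x, t):
    return math.sin(x * t)

def df_dt(x, t):
    return x * math.cos(x * t)

def a(t):
    return t              # da/dt = 1

def b(t):
    return 1 + t * t      # db/dt = 2 t

def F(t):
    """Integral of f(x, t) over x from a(t) to b(t)."""
    return simpson(lambda x: f(x, t), a(t), b(t))

t0, dt = 0.7, 1e-5
# Left side: central-difference estimate of d/dt of the integral.
lhs = (F(t0 + dt) - F(t0 - dt)) / (2 * dt)
# Right side: the three terms of Eq. (6.458).
rhs = (simpson(lambda x: df_dt(x, t0), a(t0), b(t0))
       + 2 * t0 * f(b(t0), t0)
       - 1.0 * f(a(t0), t0))
print(lhs, rhs)
```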

Problems 

1. Find the angle between the planes 

3x - y + 2z = 2, 
x-2y = 1. 

2. Find the curve of intersection of the cylinders $x^2 + y^2 = 1$ and $y^2 + z^2 = 1$. Determine also the radius
of curvature of this curve at the points (0,1,0) and (1,0,1).

3. Show that for a curve $\mathbf{r}(t)$

$\mathbf{t}^T \cdot \left(\frac{d\mathbf{t}}{ds} \times \frac{d^2\mathbf{t}}{ds^2}\right) = \kappa^2 \tau,$

$\frac{\frac{d\mathbf{r}^T}{ds} \cdot \left(\frac{d^2\mathbf{r}}{ds^2} \times \frac{d^3\mathbf{r}}{ds^3}\right)}{\frac{d^2\mathbf{r}^T}{ds^2} \cdot \frac{d^2\mathbf{r}}{ds^2}} = \tau,$

where $\mathbf{t}$ is the unit tangent, $s$ is the length along the curve, $\kappa$ is the curvature, and $\tau$ is the torsion.

4. Find the equation for the tangent to the curve of intersection of $x = 2$ and $y = 1 + xz\sin y^2 z$ at the
point $(2, 1, \pi)$.

5. Find the curvature and torsion of the curve $\mathbf{r}(t) = 2t\,\mathbf{i} + t^2\,\mathbf{j} + 2t^3\,\mathbf{k}$ at the point (2, 1, 2).

6. Apply Stokes's theorem to the plane vector field $\mathbf{u}(x, y) = u_x\,\mathbf{i} + u_y\,\mathbf{j}$ and a closed curve enclosing a
plane region. What is the result called? Use this result to find $\oint_C \mathbf{u}^T \cdot d\mathbf{r}$, where $\mathbf{u} = -y\,\mathbf{i} + x\,\mathbf{j}$ and the
integration is counterclockwise along the sides $C$ of the trapezoid with corners at (0,0), (2,0), (2,1),
and (1,1).

7. Orthogonal bipolar coordinates $(u,v,w)$ are defined by

$x = \frac{a \sinh v}{\cosh v - \cos u},$
$y = \frac{a \sin u}{\cosh v - \cos u},$
$z = w.$

For $a = 1$, plot some of the surfaces of constant $x$ and $y$ in the $u$-$v$ plane.

8. Using Cartesian index notation, show that

$\nabla \times (\mathbf{u} \times \mathbf{v}) = (\mathbf{v}^T \cdot \nabla)\mathbf{u} - (\mathbf{u}^T \cdot \nabla)\mathbf{v} + \mathbf{u}(\nabla^T \cdot \mathbf{v}) - \mathbf{v}(\nabla^T \cdot \mathbf{u}),$

where $\mathbf{u}$ and $\mathbf{v}$ are vector fields.

9. Consider two Cartesian coordinate systems: $S$ with unit vectors $(\mathbf{i}, \mathbf{j}, \mathbf{k})$, and $S'$ with $(\mathbf{i}', \mathbf{j}', \mathbf{k}')$, where
$\mathbf{i}' = \mathbf{i}$, $\mathbf{j}' = (\mathbf{j} - \mathbf{k})/\sqrt{2}$, $\mathbf{k}' = (\mathbf{j} + \mathbf{k})/\sqrt{2}$. The tensor $T$ has the following components in $S$:




Find its components in $S'$.

10. Find the matrix A that operates on any vector of unit length in the x-y plane and turns it through 
an angle around the z-axis without changing its length. Show that A is orthogonal; that is that all 
of its columns are mutually orthogonal vectors of unit magnitude. 

11. What is the unit vector normal to the plane passing through the points (1,0,0), (0,1,0) and (0,0,2)? 

12. Prove the following identities using Cartesian index notation: 

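(a) $(\mathbf{a} \times \mathbf{b})^T \cdot \mathbf{c} = \mathbf{a}^T \cdot (\mathbf{b} \times \mathbf{c})$,

(b) $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a}^T \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a}^T \cdot \mathbf{b})$,

(c) $(\mathbf{a} \times \mathbf{b})^T \cdot (\mathbf{c} \times \mathbf{d}) = ((\mathbf{a} \times \mathbf{b}) \times \mathbf{c})^T \cdot \mathbf{d}$.

13. The position of a point is given by $\mathbf{r} = \mathbf{i}\, a\cos\omega t + \mathbf{j}\, b\sin\omega t$. Show that the path of the point is an
ellipse. Find its velocity $\mathbf{v}$ and show that $\mathbf{r} \times \mathbf{v}$ = constant. Show also that the acceleration of the
point is directed towards the origin and its magnitude is proportional to the distance from the origin.

14. System $S$ is defined by the unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. Another Cartesian system $S'$ is defined by
unit vectors $\mathbf{e}'_1$, $\mathbf{e}'_2$, and $\mathbf{e}'_3$ in directions $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ where

$\mathbf{a} = \mathbf{e}_1,$

$\mathbf{b} = \mathbf{e}_2 - \mathbf{e}_3.$

(a) Find $\mathbf{e}'_1$, $\mathbf{e}'_2$, $\mathbf{e}'_3$, (b) find the transformation array $A_{ij}$, (c) show that $\delta_{ij} = A_{ki}A_{kj}$ is satisfied, and
(d) find the components of the vector $\mathbf{e}_1 + \mathbf{e}_2 + \mathbf{e}_3$ in $S'$.

15. Use Green's theorem to calculate $\oint_C \mathbf{u}^T \cdot d\mathbf{r}$, where $\mathbf{u} = x^2\,\mathbf{i} + 2xy\,\mathbf{j}$, and $C$ is the counterclockwise
path around a rectangle with vertices at (0,0), (2,0), (0,4) and (2,4).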

16. Derive an expression for the gradient, divergence, curl, and Laplacian operators in orthogonal
paraboloidal coordinates

$x = uv\cos\theta,$
$y = uv\sin\theta,$
$z = \frac{1}{2}(u^2 - v^2).$

Determine the scale factors. Find $\nabla\phi$, $\nabla^T \cdot \mathbf{u}$, $\nabla \times \mathbf{u}$, and $\nabla^2\phi$ in this coordinate system.

17. Derive an expression for the gradient, divergence, curl and Laplacian operators in orthogonal parabolic
cylindrical coordinates $(u,v,w)$ where

$x = uv,$
$y = \frac{1}{2}(u^2 - v^2),$
$z = w,$

where $u \in [0,\infty)$, $v \in (-\infty,\infty)$, and $w \in (-\infty,\infty)$.

18. Consider orthogonal elliptic cylindrical coordinates $(u, v, z)$ which are related to Cartesian coordinates
$(x,y,z)$ by

$x = a\cosh u \cos v,$
$y = a\sinh u \sin v,$
$z = z,$

where $u \in [0, \infty)$, $v \in [0, 2\pi)$ and $z \in (-\infty, \infty)$. Determine $\nabla f$, $\nabla^T \cdot \mathbf{u}$, $\nabla \times \mathbf{u}$ and $\nabla^2 f$ in this system,
where $f$ is a scalar field and $\mathbf{u}$ is a vector field.

19. Determine a unit vector in the plane of the vectors $\mathbf{i} - \mathbf{j}$ and $\mathbf{j} + \mathbf{k}$ and perpendicular to the vector
$\mathbf{i} - \mathbf{j} + \mathbf{k}$.

20. Determine a unit vector perpendicular to the plane of the vectors $\mathbf{a} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}$, $\mathbf{b} = 2\mathbf{i} + \mathbf{j} + 0\mathbf{k}$.

21. Find the curvature and the radius of curvature of $y = a\sin x$ at the peaks and valleys.

22. Determine the unit vector normal to the surface $x^3 - 2xyz + z^3 = 0$ at the point (1,1,1).

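23. Show using indicial notation that

$\nabla \times \nabla\phi = 0,$
$\nabla^T \cdot \nabla \times \mathbf{u} = 0,$
$\nabla(\mathbf{u}^T \cdot \mathbf{v}) = (\mathbf{u}^T \cdot \nabla)\mathbf{v} + (\mathbf{v}^T \cdot \nabla)\mathbf{u} + \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}),$
$\frac{1}{2}\nabla(\mathbf{u}^T \cdot \mathbf{u}) = (\mathbf{u}^T \cdot \nabla)\mathbf{u} + \mathbf{u} \times (\nabla \times \mathbf{u}),$
$\nabla^T \cdot (\mathbf{u} \times \mathbf{v}) = \mathbf{v}^T \cdot \nabla \times \mathbf{u} - \mathbf{u}^T \cdot \nabla \times \mathbf{v},$
$\nabla \times (\nabla \times \mathbf{u}) = \nabla(\nabla^T \cdot \mathbf{u}) - \nabla^2\mathbf{u},$
$\nabla \times (\mathbf{u} \times \mathbf{v}) = (\mathbf{v}^T \cdot \nabla)\mathbf{u} - (\mathbf{u}^T \cdot \nabla)\mathbf{v} + \mathbf{u}(\nabla^T \cdot \mathbf{v}) - \mathbf{v}(\nabla^T \cdot \mathbf{u}).$

24. Show that the Laplacian operator $\frac{\partial^2}{\partial x_i \partial x_i}$ has the same form in $S$ and $S'$.

25. If

$T_{ij} = \begin{pmatrix} x_1 x_2^2 & 3x_3 & x_1 - x_2 \\ x_2 x_1 & x_1 x_3 & x_3^2 + 1 \\ 0 & 4 & 2x_2 - x_3 \end{pmatrix},$

a) Evaluate $T_{ij}$ at $P : (3, 1, 2)$,

b) find $T_{(ij)}$ and $T_{[ij]}$ at $P$,

c) find the associated dual vector $d_i$,

d) find the principal values and the orientations of each associated normal vector for the symmetric
part of $T_{ij}$ evaluated at $P$,

e) evaluate the divergence of $T_{ij}$ at $P$,

f) evaluate the curl of the divergence of $T_{ij}$ at $P$.

26. Consider the tensor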



-* './' 




defined in a Cartesian coordinate system. Consider the vector associated with the plane whose normal 
points in the direction (2,5,-1). What is the magnitude of the component of the associated vector 
that is aligned with the normal to the plane? 

27. Find the invariants of the tensor 

T-( l 2 
v \2 2 

28. Find the tangent to the curve of intersection of the surfaces $y^2 = x$ and $y = xy$ at $(x, y, z) = (1, 1, 1)$.

Chapter 7
Linear analysis



see Kaplan, Chapter 1, 

see Friedman, Chapter 1, 2, 

see Riley, Hobson, and Bence, Chapters 7, 10, 15, 

see Lopez, Chapters 15, 31, 

see Greenberg, Chapters 11 and 18, 

see Wylie and Barrett, Chapter 13, 

see Michel and Herget, 

see Zeidler, 

see Riesz and Nagy, 

see Debnath and Mikusinski. 

This chapter will introduce some more formal notions of what is known as linear analysis.
We will generalize our notion of a vector; in addition to traditional vectors which exist within
a space of finite dimension, we will see how what is known as function space can be thought
of as a vector space of infinite dimension. This chapter will also introduce some of the more
formal notation of modern mathematics.

7.1 Sets 

Consider two sets $A$ and $B$. We use the following notation

$x \in A$, $x$ is an element of $A$,

$x \notin A$, $x$ is not an element of $A$,

$A = B$, $A$ and $B$ have the same elements,

$A \subset B$, the elements of $A$ also belong to $B$,

$A \cup B$, set of elements that belong to $A$ or $B$,

$A \cap B$, set of elements that belong to $A$ and $B$, and

$A - B$, set of elements that belong to $A$ but not to $B$.

If $A \subset B$, then $B - A$ is the complement of $A$ in $B$.
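For finite sets, the notation above maps directly onto Python's built-in `set` type. The particular sets `A` and `B` below are arbitrary illustrations, not taken from the text.

```python
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

is_element = 2 in A        # x is an element of A
not_element = 7 not in A   # x is not an element of A
is_subset = A <= B         # the elements of A also belong to B
union = A | B              # elements that belong to A or B
intersection = A & B       # elements that belong to A and B
complement = B - A         # complement of A in B, since A is a subset of B

print(union, intersection, complement)
```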

Some sets that are commonly used are:

$\mathbb{Z}$, set of all integers,

$\mathbb{N}$, set of all positive integers,

$\mathbb{Q}$, set of all rational numbers,

$\mathbb{R}$, set of all real numbers,

$\mathbb{R}_+$, set of all non-negative real numbers, and

$\mathbb{C}$, set of all complex numbers.

• An interval is a portion of the real line.

• An open interval $(a,b)$ does not include the end points, so that if $x \in (a, b)$, then
$a < x < b$. In set notation this is $\{x \in \mathbb{R} : a < x < b\}$ if $x$ is real.

• A closed interval $[a, b]$ includes the end points. If $x \in [a, b]$, then $a \le x \le b$. In set
notation this is $\{x \in \mathbb{R} : a \le x \le b\}$ if $x$ is real.

• The complement of any open subset of $[a, b]$ is a closed set.

• A set $A \subset \mathbb{R}$ is bounded from above if there exists a real number, called the upper
bound, such that every $x \in A$ is less than or equal to that number.

• The least upper bound or supremum is the minimum of all upper bounds.

• In a similar fashion, a set $A \subset \mathbb{R}$ can be bounded from below, in which case it will
have a greatest lower bound or infimum.

• A set which has no elements is the empty set $\{\}$, also known as the null set $\emptyset$. Note
the set with 0 as the only element, $\{0\}$, is not empty.

• A set that is either finite, or for which each element can be associated with a member
of $\mathbb{N}$ is said to be countable. Otherwise the set is uncountable.

• An ordered pair is $P = (x, y)$, where $x \in A$, and $y \in B$. Then $P \in A \times B$, where the
symbol $\times$ represents a Cartesian product. If $x \in A$ and $y \in A$ also, then we write
$P = (x, y) \in A^2$.

• A real function of a single variable can be written as $f : X \to Y$ or $y = f(x)$ where $f$
maps $x \in X \subset \mathbb{R}$ to $y \in Y \subset \mathbb{R}$. For each $x$, there is only one $y$, though there may be
more than one $x$ that maps to a given $y$. The set $X$ is called the domain of $f$, $y$ the
image of $x$, and the range the set of all images.

Figure 7.1: Riemann integration process.

7.2 Differentiation and integration 

7.2.1 Frechet derivative 

An example of a Fréchet derivative is the Jacobian derivative. It is a generalization of the
ordinary derivative.

7.2.2 Riemann integral 

Consider a function $f(t)$ defined in the interval $[a, b]$. Choose $t_1, t_2, \ldots, t_{N-1}$ such that

$a = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = b.$ (7.1)

Let $\xi_n \in [t_{n-1}, t_n]$, and

$I_N = f(\xi_1)(t_1 - t_0) + f(\xi_2)(t_2 - t_1) + \cdots + f(\xi_N)(t_N - t_{N-1}).$ (7.2)

Also let $\max_n |t_n - t_{n-1}| \to 0$ as $N \to \infty$. Then $I_N \to I$, where

$I = \int_a^b f(t)\, dt.$ (7.3)

If $I$ exists and is independent of the manner of subdivision, then $f(t)$ is Riemann integrable
in $[a, b]$. The Riemann integration process is sketched in Fig. 7.1.
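The sum $I_N$ of Eq. (7.2) is straightforward to evaluate on a computer. The sketch below assumes a uniform partition and picks each sample point $\xi_n$ at random inside its subinterval; the function name `riemann_sum` and the test integrand $\sin t$ on $[0, \pi]$ are illustrative choices.

```python
import math
import random

def riemann_sum(f, a, b, n, rng):
    """Evaluate I_N of Eq. (7.2) on a uniform partition of [a, b],
    with each sample point xi_n drawn at random from [t_{n-1}, t_n]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        xi = a + (k + rng.random()) * h
        total += f(xi) * h
    return total

rng = random.Random(0)  # fixed seed so the run is repeatable
approx = riemann_sum(math.sin, 0.0, math.pi, 20000, rng)
print(approx)  # should be close to the exact value 2
```

Because $\sin t$ is Riemann integrable, the estimate should not depend materially on how the sample points are chosen once the partition is fine.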



I 

Example 7.1 

Determine if the function $f(t)$ is Riemann integrable in $[0, 1]$ where

$f(t) = \begin{cases} 0, & \text{if } t \text{ is rational}, \\ 1, & \text{if } t \text{ is irrational}. \end{cases}$ (7.4)

On choosing $\xi_n$ rational, $I = 0$, but if $\xi_n$ is irrational, then $I = 1$. So $f(t)$ is not Riemann integrable.

1 Maurice René Fréchet, 1878-1973, French mathematician.

2 Georg Friedrich Bernhard Riemann, 1826-1866, Hanover-born German mathematician.



7.2.3 Lebesgue integral 

Let us consider sets belonging to the interval $[a, b]$ where $a$ and $b$ are real scalars. The
covering of a set is an open set which contains the given set; the covering will have a certain
length. The outer measure of a set is the length of the smallest covering possible. The inner
measure of the set is $(b - a)$ minus the outer measure of the complement of the set. If the
two measures are the same, then the value is the measure and the set is measurable.

For the set $I = (a,b)$, the measure is $m(I) = |b - a|$. If there are two disjoint intervals
$I_1 = (a, b)$ and $I_2 = (c, d)$, then the measure of $I = I_1 \cup I_2$ is $m(I) = |b - a| + |c - d|$.

Consider again a function $f(t)$ defined in the interval $[a, b]$. Let the set

$e_n = \{t : y_{n-1} \le f(t) \le y_n\},$ (7.5)

($e_n$ is the set of all $t$'s for which $f(t)$ is bounded between two values, $y_{n-1}$ and $y_n$). Also let
the sum $I_N$ be defined as

$I_N = y_1 m(e_1) + y_2 m(e_2) + \cdots + y_N m(e_N).$ (7.6)

Let $\max_n |y_n - y_{n-1}| \to 0$ as $N \to \infty$. Then $I_N \to I$, where

$I = \int_a^b f(t)\, dt.$ (7.7)

Here $I$ is said to be the Lebesgue integral of $f(t)$. The Lebesgue integration process is
sketched in Fig. 7.2.



I 

Example 7.2 

To integrate the function in the previous example, we observe first that the sets of rational and
irrational numbers in $[0,1]$ have measure zero and 1, respectively. Thus, from Eq. (7.6) the Lebesgue
integral exists, and is equal to 1. Loosely speaking, the reason is that the rationals are countable while
the irrationals are not. Both sets are dense in $[0, 1]$, but a countable set can be covered by open
intervals whose total length is as small as we please, so the rationals have measure 0. The irrationals
then have measure 1 over the same interval;
hence the integral is $I_N = y_1 m(e_1) + y_2 m(e_2) = 1(1) + 0(0) = 1$.

I 
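The Lebesgue sum of Eq. (7.6) partitions the range of the function rather than its domain. As a sketch, take the illustrative choice $f(t) = t^2$ on $[0,1]$ (not from the text): the set $e_n$ is then the interval between $\sqrt{y_{n-1}}$ and $\sqrt{y_n}$, whose measure is just its length.

```python
import math

def lebesgue_sum(n):
    """Evaluate I_N of Eq. (7.6) for f(t) = t^2 on [0, 1], using n equal
    slices y_k = k / n of the range [0, 1]."""
    total = 0.0
    for k in range(1, n + 1):
        y_prev = (k - 1) / n
        y_k = k / n
        # e_k = {t : y_prev <= t^2 <= y_k} is an interval of this length
        measure = math.sqrt(y_k) - math.sqrt(y_prev)
        total += y_k * measure
    return total

approx = lebesgue_sum(10000)
print(approx)  # should approach 1/3, the Riemann value for this smooth f
```

For this smooth function the Lebesgue and Riemann values coincide; the two notions differ only for functions such as that of Example 7.1.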



3 Henri Léon Lebesgue, 1875-1941, French mathematician.

Figure 7.2: Lebesgue integration process.

The Riemann integral is based on the concept of the length of an interval, and the 
Lebesgue integral on the measure of a set. When both integrals exist, their values are the 
same. If the Riemann integral exists, the Lebesgue integral also exists. The converse is not 
necessarily true. 

The importance of the distinction is subtle. It can be shown that certain integral oper- 
ators which operate on Lebesgue integrable functions are guaranteed to generate a function 
which is also Lebesgue integrable. In contrast, certain operators operating on functions which 
are at most Riemann integrable can generate functions which are not Riemann integrable. 



7.2.4 Cauchy principal value 

If the integrand $f(x)$ of a definite integral contains a singularity at $x = x_0$ with $x_0 \in (a, b)$,
then the Cauchy principal value is

$PV \int_a^b f(x)\, dx = \lim_{\epsilon \to 0+} \left( \int_a^{x_0 - \epsilon} f(x)\, dx + \int_{x_0 + \epsilon}^b f(x)\, dx \right).$ (7.8)
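Equation (7.8) can be illustrated with the integrand $f(x) = 1/x$ on $[-1, 2]$, an example chosen here for illustration. The two one-sided integrals diverge separately as $\epsilon \to 0$, but their sum equals $\ln 2$ for every $\epsilon$. The helpers `simpson` and `principal_value` are assumed names, and Simpson's rule is only one possible quadrature.

```python
import math

def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3

def principal_value(f, a, b, x0, eps):
    """Sum of the two integrals in Eq. (7.8) for a fixed eps > 0."""
    return simpson(f, a, x0 - eps) + simpson(f, x0 + eps, b)

pv_estimates = [principal_value(lambda x: 1.0 / x, -1.0, 2.0, 0.0, eps)
                for eps in (0.1, 0.01, 0.001)]
print(pv_estimates, math.log(2.0))
```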



7.3 Vector spaces 

A field $\mathbb{F}$ is typically a set of numbers which contains the sum, difference, product, and
quotient (excluding division by zero) of any two numbers in the field.$^4$ Examples are the sets
of rational numbers $\mathbb{Q}$, real numbers $\mathbb{R}$, or complex numbers $\mathbb{C}$. We will usually use only
$\mathbb{R}$ or $\mathbb{C}$. Note the integers $\mathbb{Z}$ are not a field as the quotient of two integers is not necessarily
an integer.

Consider a set $\mathbb{S}$ with two operations defined: addition of two elements (denoted by $+$)
both belonging to the set, and multiplication of a member of the set by a scalar belonging
to a field $\mathbb{F}$ (indicated by juxtaposition). Let us also require the set to be closed under the
operations of addition and multiplication by a scalar, i.e. if $x \in \mathbb{S}$, $y \in \mathbb{S}$, and $\alpha \in \mathbb{F}$ then
$x + y \in \mathbb{S}$, and $\alpha x \in \mathbb{S}$. Furthermore:

1. $\forall\, x, y \in \mathbb{S} : x + y = y + x$. For all elements $x$ and $y$ in $\mathbb{S}$, the addition operator on
such elements is commutative.

2. $\forall\, x, y, z \in \mathbb{S} : (x + y) + z = x + (y + z)$. For all elements $x$, $y$, and $z$ in $\mathbb{S}$, the addition
operator on such elements is associative.

3. $\exists\, 0 \in \mathbb{S} \mid \forall\, x \in \mathbb{S},\ x + 0 = x$: there exists a 0, which is an element of $\mathbb{S}$, such that for
all $x$ in $\mathbb{S}$ when the addition operator is applied to 0 and $x$, the original element $x$ is
yielded.

4. $\forall\, x \in \mathbb{S},\ \exists\, {-x} \in \mathbb{S} \mid x + (-x) = 0$. For all $x$ in $\mathbb{S}$ there exists an element $-x$, also in
$\mathbb{S}$, such that when added to $x$, yields the 0 element.

5. $\exists\, 1 \in \mathbb{F} \mid \forall\, x \in \mathbb{S},\ 1x = x$. There exists an element 1 in $\mathbb{F}$ such that for all $x$ in $\mathbb{S}$, 1
multiplying the element $x$ yields the element $x$.

6. $\forall\, \alpha, \beta \in \mathbb{F},\ \forall\, x \in \mathbb{S},\ (\alpha + \beta)x = \alpha x + \beta x$. For all $\alpha$ and $\beta$ which are in $\mathbb{F}$ and for all $x$
which are in $\mathbb{S}$, multiplication by a vector distributes over addition of scalars.

7. $\forall\, \alpha \in \mathbb{F},\ \forall\, x, y \in \mathbb{S},\ \alpha(x + y) = \alpha x + \alpha y$.

8. $\forall\, \alpha, \beta \in \mathbb{F},\ \forall\, x \in \mathbb{S},\ \alpha(\beta x) = (\alpha\beta)x$.

4 More formally a field is what is known as a commutative ring with some special properties, not discussed
here. What is known as function fields can also be defined.

Such a set is called a linear space or vector space over the field F, and its elements are 
called vectors. We will see that our definition is inclusive enough to include elements which 
are traditionally thought of as vectors (in the sense of a directed line segment), and some 
which are outside of this tradition. Note that typical vector elements x and y are no longer 
indicated in bold. However, they are in general not scalars, though in special cases, they can 
be. 

The element $0 \in \mathbb{S}$ is called the null vector. Examples of vector spaces $\mathbb{S}$ over the field of
real numbers (i.e. $\mathbb{F} : \mathbb{R}$) are:

1. $\mathbb{S} : \mathbb{R}^1$. Set of real numbers, $x = x_1$, with addition and scalar multiplication defined as
usual; also known as $\mathbb{S} : \mathbb{R}$.

2. $\mathbb{S} : \mathbb{R}^2$. Set of ordered pairs of real numbers, $x = (x_1, x_2)^T$, with addition and scalar
multiplication defined as:

$x + y = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix} = (x_1 + y_1, x_2 + y_2)^T,$ (7.9)

$\alpha x = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \end{pmatrix} = (\alpha x_1, \alpha x_2)^T,$ (7.10)

where

$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = (x_1, x_2)^T \in \mathbb{R}^2, \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = (y_1, y_2)^T \in \mathbb{R}^2, \quad \alpha \in \mathbb{R}^1.$ (7.11)

Note $\mathbb{R}^2 = \mathbb{R}^1 \times \mathbb{R}^1$, where the symbol $\times$ represents a Cartesian product.

3. $\mathbb{S} : \mathbb{R}^N$. Set of $N$ real numbers, $x = (x_1, \cdots, x_N)^T$, with addition and scalar
multiplication defined similar to that just defined in $\mathbb{R}^2$.

4. $\mathbb{S} : \mathbb{R}^\infty$. Set of an infinite number of real numbers, $x = (x_1, x_2, \cdots)^T$, with addition and
scalar multiplication defined similar to those defined for $\mathbb{R}^N$. Note, one can interpret
functions, e.g. $x = 3t^2 + t$, $t \in \mathbb{R}^1$ to generate vectors $x \in \mathbb{R}^\infty$.

5. $\mathbb{S} : \mathbb{C}$. Set of all complex numbers $z = z_1$, with $z_1 = a_1 + ib_1$; $a_1, b_1 \in \mathbb{R}^1$.

6. $\mathbb{S} : \mathbb{C}^2$. Set of all ordered pairs of complex numbers $z = (z_1, z_2)^T$, with $z_1 = a_1 + ib_1$, $z_2 =
a_2 + ib_2$; $a_1, a_2, b_1, b_2 \in \mathbb{R}^1$.

7. $\mathbb{S} : \mathbb{C}^N$. Set of $N$ complex numbers, $z = (z_1, \cdots, z_N)^T$.

8. $\mathbb{S} : \mathbb{C}^\infty$. Set of an infinite number of complex numbers, $z = (z_1, z_2, \cdots)^T$. Scalar
complex functions give rise to sets in $\mathbb{C}^\infty$.

9. $\mathbb{S} : \mathbb{M}$. Set of all $M \times N$ matrices with addition and multiplication by a scalar defined
as usual, and $M \in \mathbb{N}$, $N \in \mathbb{N}$.

10. $\mathbb{S} : C[a, b]$. Set of real-valued continuous functions, $x(t)$ for $t \in [a, b] \subset \mathbb{R}^1$ with addition
and scalar multiplication defined as usual.

11. $\mathbb{S} : C^N[a, b]$. Set of real-valued functions $x(t)$ for $t \in [a, b]$ with continuous $N$th derivative
with addition and scalar multiplication defined as usual; $N \in \mathbb{N}$.

12. $\mathbb{S} : L_2[a,b]$. Set of real-valued functions $x(t)$ such that $x(t)^2$ is Lebesgue integrable in
$t \in [a, b] \subset \mathbb{R}^1$, $a < b$, with addition and multiplication by a scalar defined as usual.
Note that the integral must be finite.

13. $\mathbb{S} : L_p[a, b]$. Set of real-valued functions $x(t)$ such that $|x(t)|^p$, $p \in [1,\infty)$, is Lebesgue
integrable in $t \in [a, b] \subset \mathbb{R}^1$, $a < b$, with addition and multiplication by a scalar defined
as usual. Note that the integral must be finite.

14. $\mathbb{S} : \mathbb{L}_p[a, b]$. Set of complex-valued functions $x(t)$ such that $|x(t)|^p$, $p \in [1, \infty) \subset \mathbb{R}^1$, is
Lebesgue integrable in $t \in [a, b] \subset \mathbb{R}^1$, $a < b$, with addition and multiplication by a
scalar defined as usual.

15. $\mathbb{S} : W_2^1(G)$. Set of real-valued functions $u(x)$ such that $u(x)^2$ and $\sum_{n=1}^N (\partial u/\partial x_n)^2$ are
Lebesgue integrable in $G$, where $x \in G \subset \mathbb{R}^N$, $N \in \mathbb{N}$. This is an example of a Sobolev
space, which is useful in variational calculus and the finite element method. Sobolev
space $W_2^1(G)$ is to Lebesgue space $L_2[a, b]$ as the real space $\mathbb{R}^1$ is to the rational space
$\mathbb{Q}^1$. That is, Sobolev space allows a broader class of functions to be solutions to physical
problems. See Zeidler.

16. $\mathbb{S} : \mathbb{P}^N$. Set of all polynomials of degree $\le N$ with addition and multiplication by a
scalar defined as usual; $N \in \mathbb{N}$.

Some examples of sets that are not vector spaces are $\mathbb{Z}$ and $\mathbb{N}$ over the field $\mathbb{R}$: they are
not closed under multiplication by a real scalar. For example, $\frac{1}{2}(1) = \frac{1}{2} \notin \mathbb{Z}$.

• $\mathbb{S}'$ is a subspace of $\mathbb{S}$ if $\mathbb{S}' \subset \mathbb{S}$, and $\mathbb{S}'$ is itself a vector space. For example $\mathbb{R}^2$ is a
subspace of $\mathbb{R}^3$.

• If $\mathbb{S}_1$ and $\mathbb{S}_2$ are subspaces of $\mathbb{S}$, then $\mathbb{S}_1 \cap \mathbb{S}_2$ is also a subspace. The set $\mathbb{S}_1 + \mathbb{S}_2$ of all
$x_1 + x_2$ with $x_1 \in \mathbb{S}_1$ and $x_2 \in \mathbb{S}_2$ is also a subspace of $\mathbb{S}$.

• If $\mathbb{S}_1 + \mathbb{S}_2 = \mathbb{S}$, and $\mathbb{S}_1 \cap \mathbb{S}_2 = \{0\}$, then $\mathbb{S}$ is the direct sum of $\mathbb{S}_1$ and $\mathbb{S}_2$, written as
$\mathbb{S} = \mathbb{S}_1 \oplus \mathbb{S}_2$.

• If $x_1, x_2, \cdots, x_N$ are elements of a vector space $\mathbb{S}$ and $\alpha_1, \alpha_2, \cdots, \alpha_N$ belong to the field
$\mathbb{F}$, then $x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_N x_N \in \mathbb{S}$ is a linear combination.

• Vectors $x_1, x_2, \cdots, x_N$ for which it is possible to have $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_N x_N = 0$
where the scalars $\alpha_n$ are not all zero, are said to be linearly dependent. Otherwise they
are linearly independent.

• For $M \le N$, the set of all linear combinations of $M$ vectors $\{x_1, x_2, \cdots, x_M\}$ of a vector
space constitute a subspace of an $N$-dimensional vector space.

• A set of $N$ linearly independent vectors in an $N$-dimensional vector space is said to
span the space.

• If the vector space $\mathbb{S}$ contains a set of $N$ linearly independent vectors, and any set
with $(N + 1)$ elements is linearly dependent, then the space is said to be finite dimensional,
and $N$ is the dimension of the space. If $N$ does not exist, the space is infinite
dimensional.

• A basis of a finite dimensional space of dimension $N$ is a set of $N$ linearly independent
vectors $\{u_1, u_2, \ldots, u_N\}$. All elements of the vector space can be represented as linear
combinations of the basis vectors.

• A set of vectors $\mathbb{S}$ in a linear space is convex iff $\forall\, x, y \in \mathbb{S}$ and $\alpha \in [0, 1] \subset \mathbb{R}^1$ implies
$\alpha x + (1 - \alpha)y \in \mathbb{S}$. For example if we consider $\mathbb{S}$ to be a subset of $\mathbb{R}^2$, that is a
region of the $x, y$ plane, $\mathbb{S}$ is convex if for any two points in $\mathbb{S}$, all points on the line
segment between them also lie in $\mathbb{S}$. Sets with lobes are not convex. Functions $f$
are convex iff the set on which they operate is convex and if $f(\alpha x + (1 - \alpha)y) \le
\alpha f(x) + (1 - \alpha)f(y)\ \forall\, x, y \in \mathbb{S},\ \alpha \in [0, 1] \subset \mathbb{R}^1$.

5 Sergei Lvovich Sobolev, 1908-1989, St. Petersburg-born Russian physicist and mathematician.
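Linear independence of a finite set of vectors in $\mathbb{R}^N$ can be tested by computing the rank of the array whose rows are the vectors. The sketch below is one possible approach, not the text's method: it uses Gaussian elimination with exact rational arithmetic, and the names `rank` and `linearly_independent` are assumptions.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the array whose rows are the given vectors, found by
    Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(c) for c in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [p - factor * q for p, q in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True when no nontrivial linear combination of the vectors is zero."""
    return rank(vectors) == len(vectors)

print(linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
print(linearly_independent([[1, 2, 3], [2, 4, 6]]))
print(linearly_independent([[1, 2], [3, 4], [5, 6]]))
```

The last case has three vectors in a two-dimensional space, so it must be dependent, consistent with the definition of dimension above.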

7.3.1 Normed spaces 

The norm $||x||$ of a vector $x \in \mathbb{S}$ is a real number that satisfies the following properties:

1. $||x|| \ge 0$,

2. $||x|| = 0$ if and only if $x = 0$,

3. $||\alpha x|| = |\alpha|\, ||x||$, $\alpha \in \mathbb{C}^1$, and

4. $||x + y|| \le ||x|| + ||y||$, (triangle or Minkowski inequality).

The norm is a natural generalization of the length of a vector. All properties of a norm can
be cast in terms of ordinary finite dimensional Euclidean vectors, and thus have geometrical
interpretations. The first property says length is greater than or equal to zero. The second
says the only vector with zero length is the zero vector. The third says the length of a scalar
multiple of a vector is equal to the magnitude of the scalar times the length of the original
vector. The Minkowski inequality is easily understood in terms of vector addition. If we add
vectorially two vectors $x$ and $y$, we will get a third vector whose length is less than or equal
to the sum of the lengths of the original two vectors. We will get equality when $x$ and $y$
point in the same direction. The interesting generalization is that these properties hold for
the norms of functions as well as ordinary geometric vectors.
Examples of norms are:

1. $x \in \mathbb{R}^1$, $||x|| = |x|$. This space is also written as $\ell_1(\mathbb{R}^1)$ or in abbreviated form $\ell_1^1$. The
subscript on $\ell$ in either case denotes the type of norm; the superscript in the second
form denotes the dimension of the space. Another way to denote this norm is $||x||_1$.

2. $x \in \mathbb{R}^2$, $x = (x_1, x_2)^T$, the Euclidean norm $||x|| = ||x||_2 = +\sqrt{x_1^2 + x_2^2} = +\sqrt{x^T x}$. We
can call this normed space $\mathbb{E}^2$, or $\ell_2(\mathbb{R}^2)$, or $\ell_2^2$.

3. $x \in \mathbb{R}^N$, $x = (x_1, x_2, \cdots, x_N)^T$, $||x|| = ||x||_2 = +\sqrt{x_1^2 + x_2^2 + \cdots + x_N^2} = +\sqrt{x^T x}$. We
can call this norm the Euclidean norm and the normed space Euclidean $\mathbb{E}^N$, or $\ell_2(\mathbb{R}^N)$
or $\ell_2^N$.

4. $x \in \mathbb{R}^N$, $x = (x_1, x_2, \cdots, x_N)^T$, $||x|| = ||x||_1 = |x_1| + |x_2| + \cdots + |x_N|$. This is also
$\ell_1(\mathbb{R}^N)$ or $\ell_1^N$.

5. $x \in \mathbb{R}^N$, $x = (x_1, x_2, \cdots, x_N)^T$, $||x|| = ||x||_p = (|x_1|^p + |x_2|^p + \cdots + |x_N|^p)^{1/p}$, where
$1 \le p < \infty$. This space is called $\ell_p(\mathbb{R}^N)$ or $\ell_p^N$.

6. $x \in \mathbb{R}^N$, $x = (x_1, x_2, \cdots, x_N)^T$, $||x|| = ||x||_\infty = \max_{1 \le n \le N} |x_n|$. This space is called
$\ell_\infty(\mathbb{R}^N)$ or $\ell_\infty^N$.

7. $x \in \mathbb{C}^N$, $x = (x_1, x_2, \cdots, x_N)^T$, $||x|| = ||x||_2 = +\sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_N|^2} =
+\sqrt{\overline{x}^T x}$. This space is described as $\ell_2(\mathbb{C}^N)$.

8. $x \in C[a, b]$, $||x|| = \max_{a \le t \le b} |x(t)|$; $t \in [a, b] \subset \mathbb{R}^1$.

9. $x \in C^1[a, b]$, $||x|| = \max_{a \le t \le b} |x(t)| + \max_{a \le t \le b} |x'(t)|$; $t \in [a, b] \subset \mathbb{R}^1$.

10. $x \in L_2[a,b]$, $||x|| = ||x||_2 = +\sqrt{\int_a^b x(t)^2\, dt}$; $t \in [a, b] \subset \mathbb{R}^1$.

11. $x \in L_p[a,b]$, $||x|| = ||x||_p = +\left(\int_a^b |x(t)|^p\, dt\right)^{1/p}$; $t \in [a, b] \subset \mathbb{R}^1$.

12. $x \in \mathbb{L}_2[a,b]$, $||x|| = ||x||_2 = +\sqrt{\int_a^b |x(t)|^2\, dt} = +\sqrt{\int_a^b \overline{x}(t) x(t)\, dt}$; $t \in [a,b] \subset \mathbb{R}^1$.

13. $x \in \mathbb{L}_p[a,b]$, $||x|| = ||x||_p = +\left(\int_a^b |x(t)|^p\, dt\right)^{1/p} = +\left(\int_a^b \left(\overline{x}(t) x(t)\right)^{p/2} dt\right)^{1/p}$; $t \in
[a, b] \subset \mathbb{R}^1$.

14. $u \in W_2^1(G)$, $||u|| = ||u||_{1,2} = +\sqrt{\int_G \left(u(x)u(x) + \sum_{n=1}^N (\partial u/\partial x_n)(\partial u/\partial x_n)\right) dx}$; $x \in
G \subset \mathbb{R}^N$, $u \in L_2(G)$, $\partial u/\partial x_n \in L_2(G)$. This is an example of a Sobolev space which is
useful in variational calculus and the finite element method.

6 Hermann Minkowski, 1864-1909, Russian/Lithuanian-born German-based mathematician and physicist.

Some additional notes on properties of norms include

• A vector space in which a norm is defined is called a normed vector space.

• The metric or distance between $x$ and $y$ is defined by $d(x, y) = ||x - y||$. This is a natural
metric induced by the norm. Thus, $||x||$ is the distance between $x$ and the null vector.

• The diameter of a set of vectors is the supremum (i.e. least upper bound) of the distance
between any two vectors of the set.

• Let $\mathbb{S}_1$ and $\mathbb{S}_2$ be subsets of a normed vector space $\mathbb{S}$ such that $\mathbb{S}_1 \subset \mathbb{S}_2$. Then $\mathbb{S}_1$
is dense in $\mathbb{S}_2$ if for every $x^{(2)} \in \mathbb{S}_2$ and every $\epsilon > 0$, there is a $x^{(1)} \in \mathbb{S}_1$ for which
$||x^{(2)} - x^{(1)}|| < \epsilon$.

• A sequence $x^{(1)}, x^{(2)}, \cdots \in \mathbb{S}$, where $\mathbb{S}$ is a normed vector space, is a Cauchy sequence
if for every $\epsilon > 0$ there exists a number $N_\epsilon$ such that $||x^{(m)} - x^{(n)}|| < \epsilon$ for every $m$
and $n$ greater than $N_\epsilon$.

• The sequence $x^{(1)}, x^{(2)}, \cdots \in \mathbb{S}$, where $\mathbb{S}$ is a normed vector space, converges if there
exists an $x \in \mathbb{S}$ such that $\lim_{n \to \infty} ||x^{(n)} - x|| = 0$. Then $x$ is the limit point of the
sequence, and we write $\lim_{n \to \infty} x^{(n)} = x$ or $x^{(n)} \to x$.

• Every convergent sequence is a Cauchy sequence, but the converse is not true in general.

• A normed vector space $\mathbb{S}$ is complete if every Cauchy sequence in $\mathbb{S}$ is convergent, i.e.
if $\mathbb{S}$ contains all the limit points.

• A complete normed vector space is also called a Banach space.

• It can be shown that every finite dimensional normed vector space is complete.

• Norms $||\cdot||_n$ and $||\cdot||_m$ in $\mathbb{S}$ are equivalent if there exist $a, b > 0$ such that, for any
$x \in \mathbb{S}$,

$a||x||_m \le ||x||_n \le b||x||_m.$ (7.12)

• In a finite dimensional vector space, any norm is equivalent to any other norm. So,
the convergence of a sequence in such a space does not depend on the choice of norm.

7 Augustin-Louis Cauchy, 1789-1857, French mathematician and physicist.
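As a concrete instance of Eq. (7.12), the Euclidean and maximum norms on $\mathbb{R}^N$ satisfy $||x||_\infty \le ||x||_2 \le \sqrt{N}\, ||x||_\infty$, that is, $a = 1$ and $b = \sqrt{N}$. This standard bound is not stated in the text; the sketch below simply spot-checks it on random vectors, with assumed helper names `norm_2` and `norm_inf`.

```python
import math
import random

def norm_2(x):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in x))

def norm_inf(x):
    """Maximum norm."""
    return max(abs(c) for c in x)

rng = random.Random(1)
N = 5
bounds_hold = True
for _ in range(1000):
    x = [rng.uniform(-10.0, 10.0) for _ in range(N)]
    lower = 1.0 * norm_inf(x)            # a = 1
    upper = math.sqrt(N) * norm_inf(x)   # b = sqrt(N)
    # small tolerance guards against floating-point rounding
    if not (lower <= norm_2(x) + 1e-12 and norm_2(x) <= upper + 1e-12):
        bounds_hold = False

print(bounds_hold)
```

A random test is evidence, not a proof; the inequality itself follows from comparing each $|x_n|$ with the largest one.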

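We recall that if $z \in \mathbb{C}^1$, then we can represent $z$ as $z = a + ib$ where $a \in \mathbb{R}^1$, $b \in \mathbb{R}^1$;
further, the complex conjugate of $z$ is represented as $\overline{z} = a - ib$. It can be easily shown for
$z_1 \in \mathbb{C}^1$, $z_2 \in \mathbb{C}^1$ that

$\overline{(z_1 + z_2)} = \overline{z_1} + \overline{z_2},$

$\overline{(z_1 - z_2)} = \overline{z_1} - \overline{z_2},$

$\overline{z_1 z_2} = \overline{z_1}\, \overline{z_2},$ and

$\overline{\left(\frac{z_1}{z_2}\right)} = \frac{\overline{z_1}}{\overline{z_2}}.$

We also recall that the modulus of $z$, $|z|$, has the following properties:

$|z|^2 = z\overline{z},$ (7.13)
$= (a + ib)(a - ib),$ (7.14)
$= a^2 + iab - iab - i^2 b^2,$ (7.15)
$= a^2 + b^2 \ge 0.$ (7.16)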



I 

Example 7.3 

Consider $x \in \mathbb{R}^3$ and take

$x = \begin{pmatrix} 1 \\ -4 \\ 2 \end{pmatrix}.$ (7.17)

Find the norm if $x \in \ell_1^3$ (absolute value norm), $x \in \ell_2^3$ (Euclidean norm), if $x \in \ell_3^3$ (another norm), and
if $x \in \ell_\infty^3$ (maximum norm).

By the definition of the absolute value norm for $x \in \ell_1^3$,

$||x|| = ||x||_1 = |x_1| + |x_2| + |x_3|,$ (7.18)

we get

$||x||_1 = |1| + |-4| + |2| = 1 + 4 + 2 = 7.$ (7.19)

Now consider the Euclidean norm for $x \in \ell_2^3$. By the definition of the Euclidean norm,

$||x|| = ||x||_2 = +\sqrt{x_1^2 + x_2^2 + x_3^2},$ (7.20)

we get

$||x||_2 = +\sqrt{1^2 + (-4)^2 + 2^2} = \sqrt{1 + 16 + 4} = +\sqrt{21} \approx 4.583.$ (7.21)

Since the norm is Euclidean, this is the ordinary length of the vector.
For the norm, $x \in \ell_3^3$, we have

$||x|| = ||x||_3 = +\left(|x_1|^3 + |x_2|^3 + |x_3|^3\right)^{1/3},$ (7.22)

so

$||x||_3 = +\left(|1|^3 + |-4|^3 + |2|^3\right)^{1/3} = (1 + 64 + 8)^{1/3} \approx 4.179.$ (7.23)

For the maximum norm, $x \in \ell_\infty^3$, we have

$||x|| = ||x||_\infty = \lim_{p \to \infty} +\left(|x_1|^p + |x_2|^p + |x_3|^p\right)^{1/p},$ (7.24)

so

$||x||_\infty = \lim_{p \to \infty} +\left(|1|^p + |-4|^p + |2|^p\right)^{1/p} = 4.$ (7.25)

This selects the magnitude of the component of $x$ whose magnitude is maximum. Note that as $p$
increases the norm of the vector decreases.

8 Stefan Banach, 1892-1945, Polish mathematician.

I 
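The norms computed in Example 7.3 can be reproduced in a few lines. This is a sketch: `p_norm` and `max_norm` are illustrative helper names, and the large value $p = 50$ is used only to suggest the limit $p \to \infty$.

```python
def p_norm(x, p):
    """The p-norm (|x_1|^p + ... + |x_N|^p)^(1/p) for 1 <= p < infinity."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def max_norm(x):
    """The maximum norm, the p -> infinity limit of the p-norm."""
    return max(abs(c) for c in x)

x = [1, -4, 2]
norm_1 = p_norm(x, 1)
norm_2 = p_norm(x, 2)
norm_3 = p_norm(x, 3)
norm_50 = p_norm(x, 50)
norm_max = max_norm(x)

print(norm_1, norm_2, norm_3, norm_50, norm_max)
```

The printed values should decrease monotonically with $p$, in line with the closing remark of the example.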



I 

Example 7.4 

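For $x \in \ell_2(\mathbb{C}^2)$, find the norm of

$$x = \begin{pmatrix} i \\ 1 \end{pmatrix} = \begin{pmatrix} 0 + 1i \\ 1 + 0i \end{pmatrix}. \tag{7.26}$$

The definition of the space specifies that the norm is a 2 norm ("Euclidean"):

$$\|x\| = \|x\|_2 = +\sqrt{\overline{x}^T x} = +\sqrt{\overline{x}_1 x_1 + \overline{x}_2 x_2} = \sqrt{|x_1|^2 + |x_2|^2}, \tag{7.27}$$

$$\|x\|_2 = +\sqrt{\begin{pmatrix} \overline{0 + 1i} & \overline{1 + 0i} \end{pmatrix} \begin{pmatrix} 0 + 1i \\ 1 + 0i \end{pmatrix}}, \tag{7.28}$$

$$\|x\|_2 = +\sqrt{\overline{(0 + 1i)}(0 + 1i) + \overline{(1 + 0i)}(1 + 0i)} = +\sqrt{(0 - 1i)(0 + 1i) + (1 - 0i)(1 + 0i)}, \tag{7.29}$$

$$\|x\|_2 = +\sqrt{-i^2 + 1} = +\sqrt{2}. \tag{7.30}$$

Note that if we were negligent in the use of the conjugate and defined the norm as $\|x\|_2 = +\sqrt{x^T x}$,
we would obtain

$$\|x\|_2 = +\sqrt{x^T x} = +\sqrt{\begin{pmatrix} i & 1 \end{pmatrix} \begin{pmatrix} i \\ 1 \end{pmatrix}} = +\sqrt{i^2 + 1} = +\sqrt{-1 + 1} = 0! \tag{7.31}$$

This violates the property of the norm that $\|x\| > 0$ if $x \ne 0$!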
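The role of the conjugate in Example 7.4 can be seen in a short Python sketch. The helper names `inner`, `norm2`, and `naive_square` are illustrative assumptions, not notation from the notes; `inner` conjugates its first argument, matching Eq. (7.27).

```python
import math

def inner(x, y):
    """Complex inner product with the conjugate on the first argument."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(x, y))

def norm2(x):
    """2-norm built from the conjugated inner product."""
    return math.sqrt(inner(x, x).real)

def naive_square(x):
    """The 'negligent' quantity x^T x, formed without the conjugate."""
    return sum(complex(a) * complex(a) for a in x)

x = [1j, 1]
correct = norm2(x)            # uses the conjugate
negligent = naive_square(x)   # omits the conjugate
print(correct, negligent)
```

With the conjugate the norm is $\sqrt{2}$; without it the squared "norm" of a non-zero vector collapses to zero, which is the failure shown in Eq. (7.31).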



I 

Example 7.5 

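Consider $x \in L_2[0, 1]$ where $x(t) = 2t$; $t \in [0, 1] \in \mathbb{R}^1$. Find $\|x\|$.

By the definition of the norm for this space, we have

$$\|x\| = \|x\|_2 = +\sqrt{\int_0^1 x^2(t) \, dt}, \tag{7.32}$$

$$\|x\|_2^2 = \int_0^1 x(t) x(t) \, dt = \int_0^1 (2t)(2t) \, dt = 4 \int_0^1 t^2 \, dt = 4 \left. \left( \frac{t^3}{3} \right) \right|_0^1, \tag{7.33}$$

$$\|x\|_2^2 = 4 \left( \frac{1^3}{3} - \frac{0^3}{3} \right) = \frac{4}{3}, \tag{7.34}$$

$$\|x\|_2 = \frac{2\sqrt{3}}{3} \approx 1.1547. \tag{7.35}$$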



I 

Example 7.6 

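Consider $x \in \mathbb{L}_3[-2, 3]$ where $x(t) = 1 + 2it$; $t \in [-2, 3] \in \mathbb{R}^1$. Find $\|x\|$.

By the definition of the norm we have

$$\|x\| = \|x\|_3 = +\left( \int_{-2}^{3} |1 + 2it|^3 \, dt \right)^{1/3}, \tag{7.36}$$

$$\|x\|_3 = +\left( \int_{-2}^{3} \left( \overline{(1 + 2it)} (1 + 2it) \right)^{3/2} dt \right)^{1/3}, \tag{7.37}$$

$$\|x\|_3^3 = \int_{-2}^{3} \left( \overline{(1 + 2it)} (1 + 2it) \right)^{3/2} dt, \tag{7.38}$$

$$\|x\|_3^3 = \int_{-2}^{3} \left( (1 - 2it)(1 + 2it) \right)^{3/2} dt, \tag{7.39}$$

$$\|x\|_3^3 = \int_{-2}^{3} \left( 1 + 4t^2 \right)^{3/2} dt, \tag{7.40}$$

$$\|x\|_3^3 = \left. \left( \sqrt{1 + 4t^2} \left( \frac{5t}{8} + t^3 \right) + \frac{3}{16} \sinh^{-1}(2t) \right) \right|_{-2}^{3}, \tag{7.41}$$

$$\|x\|_3^3 = \frac{37\sqrt{17}}{4} + \frac{3 \sinh^{-1}(4)}{16} + \frac{3}{16} \left( 154\sqrt{37} + \sinh^{-1}(6) \right) \approx 214.638, \tag{7.42}$$

$$\|x\|_3 \approx 5.98737. \tag{7.43}$$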
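Function-space norms such as the one in Example 7.6 can also be approximated by quadrature. The sketch below is a hedged illustration, assuming hypothetical helpers `simpson` (composite Simpson's rule) and `lp_norm`; the notes themselves evaluate the integral in closed form.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def lp_norm(x, a, b, p):
    """Approximate the L_p norm of a (possibly complex) function x on [a, b]."""
    return simpson(lambda t: abs(x(t)) ** p, a, b) ** (1.0 / p)

# Example 7.6: x(t) = 1 + 2 i t on [-2, 3], p = 3.
norm3 = lp_norm(lambda t: 1 + 2j * t, -2.0, 3.0, 3)

# Closed form of Eq. (7.42) for comparison.
closed = (37 * math.sqrt(17) / 4 + 3 * math.asinh(4) / 16
          + 3 / 16 * (154 * math.sqrt(37) + math.asinh(6)))
print(norm3, closed ** (1.0 / 3.0))
```

The same `lp_norm` helper applies to Example 7.5 by passing `lambda t: 2 * t`, the interval $[0, 1]$, and $p = 2$.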



I 

Example 7. 7 

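Consider $x \in \mathbb{L}_p[a, b]$ where $x(t) = c$; $t \in [a, b] \in \mathbb{R}^1$, $c \in \mathbb{C}^1$. Find $\|x\|$.
Let us take the complex constant $c = \alpha + i\beta$, $\alpha \in \mathbb{R}^1$, $\beta \in \mathbb{R}^1$. Then

$$|c| = \left( \alpha^2 + \beta^2 \right)^{1/2}. \tag{7.44}$$

Now

$$\|x\| = \|x\|_p = \left( \int_a^b |x(t)|^p \, dt \right)^{1/p}, \tag{7.45}$$

$$\|x\|_p = \left( \int_a^b \left( \alpha^2 + \beta^2 \right)^{p/2} dt \right)^{1/p}, \tag{7.46}$$

$$\|x\|_p = \left( \left( \alpha^2 + \beta^2 \right)^{p/2} \int_a^b dt \right)^{1/p}, \tag{7.47}$$

$$\|x\|_p = \left( \left( \alpha^2 + \beta^2 \right)^{p/2} (b - a) \right)^{1/p}, \tag{7.48}$$

$$\|x\|_p = \left( \alpha^2 + \beta^2 \right)^{1/2} (b - a)^{1/p}, \tag{7.49}$$

$$\|x\|_p = |c| (b - a)^{1/p}. \tag{7.50}$$

Note the norm is proportional to the magnitude of the complex constant $c$. For finite $p$, it also increases
with the extent of the domain $b - a$. For infinite $p$, it is independent of the length of the domain, and
simply selects the value $|c|$. This is consistent with the norm in $\mathbb{L}_\infty$ selecting the maximum value of
the function.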

I 



I 

Example 7.8 

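Consider $x \in L_p[0, b]$ where $x(t) = 2t^2$; $t \in [0, b] \in \mathbb{R}^1$. Find $\|x\|$.

Now

$$\|x\| = \|x\|_p = \left( \int_0^b |x(t)|^p \, dt \right)^{1/p}, \tag{7.51}$$

$$\|x\|_p = \left( \int_0^b |2t^2|^p \, dt \right)^{1/p}, \tag{7.52}$$

$$\|x\|_p = \left( \int_0^b 2^p t^{2p} \, dt \right)^{1/p}, \tag{7.53}$$

$$\|x\|_p = \left( \left. \left( \frac{2^p t^{2p+1}}{2p + 1} \right) \right|_0^b \right)^{1/p}, \tag{7.54}$$

$$\|x\|_p = \left( \frac{2^p b^{2p+1}}{2p + 1} \right)^{1/p}, \tag{7.55}$$

$$\|x\|_p = \frac{2 b^{\frac{2p+1}{p}}}{(2p + 1)^{1/p}}. \tag{7.56}$$

Note as $p \to \infty$ that $(2p + 1)^{1/p} \to 1$, and $(2p + 1)/p \to 2$, so

$$\lim_{p \to \infty} \|x\| = 2b^2. \tag{7.57}$$

This is the maximum value of $x(t) = 2t^2$ in $t \in [0, b]$, as expected.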



I 

Example 7.9 

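Consider $u \in W_2^1(G)$ with $u(x) = 2x^4$; $x \in [0, 3] \in \mathbb{R}^1$. Find $\|u\|$.

Here we require $u \in L_2[0, 3]$ and $\partial u / \partial x \in L_2[0, 3]$, which for our choice of $u$, is satisfied. The
formula for the norm in $W_2^1[0, 3]$ is

$$\|u\| = \|u\|_{1,2} = +\sqrt{\int_0^3 \left( u(x) u(x) + \frac{du}{dx} \frac{du}{dx} \right) dx}, \tag{7.58}$$

$$\|u\|_{1,2} = +\sqrt{\int_0^3 \left( (2x^4)(2x^4) + (8x^3)(8x^3) \right) dx}, \tag{7.59}$$

$$\|u\|_{1,2} = +\sqrt{\int_0^3 \left( 4x^8 + 64x^6 \right) dx}, \tag{7.60}$$

$$\|u\|_{1,2} = +\sqrt{\left. \left( \frac{4x^9}{9} + \frac{64x^7}{7} \right) \right|_0^3} = 54\sqrt{\frac{69}{7}} \approx 169.539. \tag{7.61}$$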



J 
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I 

Example 7.10 

Consider the sequence of vectors $\{x_{(1)}, x_{(2)}, \ldots\} \in \mathbb{Q}^3$, where $\mathbb{Q}^3$ is the space of rational numbers
over the field of rational numbers, and

$$x_{(1)} = (1, 3, 0) = \left( x_{(1)1}, x_{(1)2}, x_{(1)3} \right), \tag{7.62}$$

$$x_{(2)} = \left( \frac{1}{1 + 1}, 3, 0 \right) = \left( \frac{1}{2}, 3, 0 \right), \tag{7.63}$$

$$x_{(3)} = \left( \frac{1}{1 + \frac{1}{2}}, 3, 0 \right) = \left( \frac{2}{3}, 3, 0 \right), \tag{7.64}$$

$$x_{(4)} = \left( \frac{1}{1 + \frac{2}{3}}, 3, 0 \right) = \left( \frac{3}{5}, 3, 0 \right), \tag{7.65}$$

$$\vdots \tag{7.66}$$

$$x_{(n)} = \left( \frac{1}{1 + x_{(n-1)1}}, 3, 0 \right), \tag{7.67}$$

for $n \ge 2$. Does this sequence have a limit point in $\mathbb{Q}^3$? Is this a Cauchy sequence?

Consider the first term only; the other two are trivial. The series has converged when the $n$th term
is equal to the $(n-1)$th term:

$$x_{(n-1)1} = \frac{1}{1 + x_{(n-1)1}}. \tag{7.68}$$

Rearranging, it is found that

$$x_{(n-1)1}^2 + x_{(n-1)1} - 1 = 0. \tag{7.69}$$

Solving, one finds that

$$x_{(n-1)1} = \frac{-1 \pm \sqrt{5}}{2}. \tag{7.70}$$

We find from numerical experimentation that it is the "+" root to which $x_1$ converges:

$$\lim_{n \to \infty} x_{(n-1)1} = \frac{\sqrt{5} - 1}{2}. \tag{7.71}$$

As $n \to \infty$,

$$x_{(n)} \to \left( \frac{\sqrt{5} - 1}{2}, 3, 0 \right). \tag{7.72}$$

Thus, the limit point for this sequence is not in $\mathbb{Q}^3$; hence the sequence is not convergent. Had the set
been defined in $\mathbb{R}^3$, it would have been convergent.

However, the sequence is a Cauchy sequence. Consider, say, $\epsilon = 0.01$. We find by
numerical experimentation that $N_\epsilon = 4$ suffices. Choosing, for example, $m = 5 > N_\epsilon$ and $n = 21 > N_\epsilon$, we get

$$x_{(5)} = \left( \frac{5}{8}, 3, 0 \right), \tag{7.73}$$

$$x_{(21)} = \left( \frac{10946}{17711}, 3, 0 \right), \tag{7.74}$$

$$\|x_{(5)} - x_{(21)}\|_2 = \left\| \left( \frac{987}{141688}, 0, 0 \right)^T \right\|_2 = 0.00696 < 0.01. \tag{7.75}$$

This could be generalized for arbitrary $\epsilon$, so the sequence can be shown to be a Cauchy sequence.
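The "numerical experimentation" mentioned in Example 7.10 can be reproduced exactly with rational arithmetic. The following Python sketch is an illustration only; the helper name `first_component` is an assumption, and the standard-library `Fraction` type stands in for elements of $\mathbb{Q}$.

```python
from fractions import Fraction

def first_component(n):
    """First component of x_(n): start from 1 and apply x -> 1/(1 + x)."""
    x = Fraction(1)
    for _ in range(n - 1):
        x = 1 / (1 + x)
    return x

x5 = first_component(5)
x21 = first_component(21)
gap = abs(x5 - x21)   # distance between x_(5) and x_(21); other components agree

# Every term is rational, yet no term satisfies the limit equation x^2 + x - 1 = 0.
residual = x21 ** 2 + x21 - 1
print(x5, x21, gap, float(gap), float(residual))
```

Because each iterate is an exact `Fraction`, the residual of Eq. (7.69) shrinks but never reaches zero, which mirrors the statement that the limit point lies outside $\mathbb{Q}^3$.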
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I 

Example 7.11 

Does the infinite sequence of functions

$$v = \{v_1(t), v_2(t), \ldots, v_n(t), \ldots\} = \left\{ t(t), t(t^2), t(t^3), \ldots, t(t^n), \ldots \right\}, \tag{7.76}$$

converge in $L_2[0, 1]$? Does the sequence converge in $C[0, 1]$?
First, check if the sequence is a Cauchy sequence:

$$\lim_{n, m \to \infty} \|v_n(t) - v_m(t)\|_2 = \sqrt{\int_0^1 \left( t^{n+1} - t^{m+1} \right)^2 dt} = \sqrt{\frac{1}{2n + 3} - \frac{2}{m + n + 3} + \frac{1}{2m + 3}} = 0. \tag{7.77}$$

As this norm approaches zero, it will be possible for any $\epsilon > 0$ to find an integer $N_\epsilon$ such that
$\|v_n(t) - v_m(t)\|_2 < \epsilon$ for all $m, n > N_\epsilon$. So, the sequence is a Cauchy sequence. We also have

$$\lim_{n \to \infty} v_n(t) = \begin{cases} 0, & t \in [0, 1), \\ 1, & t = 1. \end{cases} \tag{7.78}$$

The function given in Eq. (7.78), the "limit point" to which the sequence converges, is in $L_2[0, 1]$, which
is a sufficient condition for convergence of the sequence of functions in $L_2[0, 1]$. However, the "limit point"
is not a continuous function, so despite the fact that the sequence is a Cauchy sequence and elements
of the sequence are in $C[0, 1]$, the sequence does not converge in $C[0, 1]$.

I 



I 

Example 7.12 

Analyze the sequence of functions

$$v = \{v_1, v_2, \ldots, v_n, \ldots\} = \left\{ \sqrt{2} \sin(\pi t), \sqrt{2} \sin(2\pi t), \ldots, \sqrt{2} \sin(n\pi t), \ldots \right\}, \tag{7.79}$$

in $L_2[0, 1]$.

This is simply a set of sine functions, which can be shown to form a basis; such a proof will not be
given here. Each element of the set has unit norm:

$$\|v_n(t)\|_2 = \left( \int_0^1 \left( \sqrt{2} \sin(n\pi t) \right)^2 dt \right)^{1/2} = 1. \tag{7.80}$$

It is also easy to show that $\int_0^1 v_n(t) v_m(t) \, dt = 0$ for $n \ne m$, so the basis is orthonormal. As $n \to \infty$, the norm of
the basis function remains bounded, and is, in fact, unity.

Consider the norm of the difference of the $m$th and $n$th functions:

$$\|v_n(t) - v_m(t)\|_2 = \left( \int_0^1 \left( \sqrt{2} \sin(n\pi t) - \sqrt{2} \sin(m\pi t) \right)^2 dt \right)^{1/2} = \sqrt{2}. \tag{7.81}$$

This is valid for all $m \ne n$. Since we can find a value of $\epsilon > 0$ which violates the conditions for a
Cauchy sequence, this series of functions is not a Cauchy sequence.
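The constant separation $\sqrt{2}$ in Eq. (7.81) can be confirmed by quadrature. The sketch below is a hedged numerical check, assuming hypothetical helpers `simpson`, `v`, `l2_norm`, and `distance`; the index pairs are arbitrary choices for illustration.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def v(n):
    """n-th function of the sequence, sqrt(2) sin(n pi t)."""
    return lambda t: math.sqrt(2.0) * math.sin(n * math.pi * t)

def l2_norm(f):
    """Approximate L_2[0, 1] norm of a real function f."""
    return math.sqrt(simpson(lambda t: f(t) ** 2, 0.0, 1.0))

def distance(n, m):
    """Approximate ||v_n - v_m||_2 on [0, 1]."""
    vn, vm = v(n), v(m)
    return l2_norm(lambda t: vn(t) - vm(t))

distances = {pair: distance(*pair) for pair in [(1, 2), (3, 7), (10, 11)]}
for pair, d in distances.items():
    print(pair, round(d, 6))
```

Since the distance does not shrink as the indices grow, no $N_\epsilon$ can be found for any $\epsilon < \sqrt{2}$.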

I 
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7.3.2 Inner product spaces 

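The inner product $\langle x, y \rangle$ is, in general, a complex scalar ($\langle x, y \rangle \in \mathbb{C}^1$) associated with
two elements $x$ and $y$ of a normed vector space satisfying the following rules. For $x, y, z \in \mathbb{S}$
and $\alpha, \beta \in \mathbb{C}$,

1. $\langle x, x \rangle > 0$ if $x \ne 0$,

2. $\langle x, x \rangle = 0$ if and only if $x = 0$,

3. $\langle x, \alpha y + \beta z \rangle = \alpha \langle x, y \rangle + \beta \langle x, z \rangle$, $\alpha \in \mathbb{C}^1$, $\beta \in \mathbb{C}^1$, and

4. $\langle x, y \rangle = \overline{\langle y, x \rangle}$, where $\overline{\langle \cdot \rangle}$ indicates the complex conjugate of the inner product.

Inner product spaces are subspaces of linear vector spaces and are sometimes called
pre-Hilbert$^{9}$ spaces. A pre-Hilbert space is not necessarily complete, so it may or may not form
a Banach space.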



I 

Example 7.13 

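Show

$$\langle \alpha x, y \rangle = \overline{\alpha} \langle x, y \rangle. \tag{7.82}$$

Using the properties of the inner product and the complex conjugate we have

$$\langle \alpha x, y \rangle = \overline{\langle y, \alpha x \rangle}, \tag{7.83}$$

$$= \overline{\alpha \langle y, x \rangle}, \tag{7.84}$$

$$= \overline{\alpha} \, \overline{\langle y, x \rangle}, \tag{7.85}$$

$$= \overline{\alpha} \langle x, y \rangle. \tag{7.86}$$

Note that in a real vector space we have

$$\langle x, \alpha y \rangle = \langle \alpha x, y \rangle = \alpha \langle x, y \rangle, \quad \text{and also that,} \tag{7.87}$$

$$\langle x, y \rangle = \langle y, x \rangle, \tag{7.88}$$

since every scalar is equal to its complex conjugate.

Note that some authors use $\langle \alpha y + \beta z, x \rangle = \alpha \langle y, x \rangle + \beta \langle z, x \rangle$ instead of Property 3
that we have chosen.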



9 David Hilbert, 1862-1943, German mathematician of great influence.

7.3.2.1 Hilbert space

A Banach space (i.e. a complete normed vector space) on which an inner product is defined
is also called a Hilbert space. While Banach spaces allow for the definition of several types
of norms, Hilbert spaces are more restrictive: we must define the norm such that

$$\|x\| = \|x\|_2 = +\sqrt{\langle x, x \rangle}. \tag{7.89}$$

As a counterexample, if $x \in \mathbb{R}^2$, and we take $\|x\| = \|x\|_3 = \left( |x_1|^3 + |x_2|^3 \right)^{1/3}$ (thus $x \in \ell_3^2$,
which is a Banach space), we cannot find a definition of the inner product which satisfies all
its properties. Thus, the space $\ell_3^2$ cannot be a Hilbert space! Unless specified otherwise, the
unsubscripted norm $\|\cdot\|$ can be taken to represent the Hilbert space norm $\|\cdot\|_2$. It is common
for both subscripted and unsubscripted versions of the norm to appear in the literature.
The Cauchy-Schwarz$^{10}$ inequality is embodied in the following:

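Theorem

For $x$ and $y$ which are elements of a Hilbert space,

$$\|x\|_2 \|y\|_2 \ge |\langle x, y \rangle|. \tag{7.90}$$

If $y = 0$, both sides are zero and the equality holds. Let us take $y \ne 0$. Then, we have

$$\|x - \alpha y\|_2^2 = \langle x - \alpha y, x - \alpha y \rangle, \quad \text{where } \alpha \text{ is any scalar,} \tag{7.91}$$

$$= \langle x, x \rangle - \langle x, \alpha y \rangle - \langle \alpha y, x \rangle + \langle \alpha y, \alpha y \rangle, \tag{7.92}$$

$$= \langle x, x \rangle - \alpha \langle x, y \rangle - \overline{\alpha} \langle y, x \rangle + \alpha \overline{\alpha} \langle y, y \rangle, \tag{7.93}$$

$$\text{on choosing } \alpha = \frac{\langle y, x \rangle}{\langle y, y \rangle} = \frac{\overline{\langle x, y \rangle}}{\langle y, y \rangle}, \tag{7.94}$$

$$= \langle x, x \rangle - \frac{\overline{\langle x, y \rangle}}{\langle y, y \rangle} \langle x, y \rangle - \frac{\langle x, y \rangle}{\langle y, y \rangle} \langle y, x \rangle + \frac{\langle y, x \rangle \langle x, y \rangle}{\langle y, y \rangle^2} \langle y, y \rangle, \tag{7.95}$$

$$= \|x\|_2^2 - \frac{|\langle x, y \rangle|^2}{\|y\|_2^2}, \tag{7.96}$$

$$\|x - \alpha y\|_2^2 \, \|y\|_2^2 = \|x\|_2^2 \, \|y\|_2^2 - |\langle x, y \rangle|^2. \tag{7.97}$$

Since $\|x - \alpha y\|_2^2 \, \|y\|_2^2 \ge 0$,

$$\|x\|_2^2 \, \|y\|_2^2 - |\langle x, y \rangle|^2 \ge 0, \tag{7.98}$$

$$\|x\|_2^2 \, \|y\|_2^2 \ge |\langle x, y \rangle|^2, \tag{7.99}$$

$$\|x\|_2 \, \|y\|_2 \ge |\langle x, y \rangle|, \quad \text{QED.} \tag{7.100}$$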



lc Karl Hermann Amandus Schwarz, 1843-1921, Silesia-born German mathematician, deeply influenced by 
Weierstrass, on the faculty at Berlin, captain of the local volunteer fire brigade, and assistant to railway 
stationmaster. 
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Note that this effectively defines the angle between two vectors. Because of the inequality, 
we have 

$$\frac{\|x\|_2 \, \|y\|_2}{|\langle x, y \rangle|} \ge 1, \tag{7.101}$$

$$\frac{|\langle x, y \rangle|}{\|x\|_2 \, \|y\|_2} \le 1. \tag{7.102}$$

Defining $\alpha$ to be the angle between the vectors $x$ and $y$, we recover the familiar result from
vector analysis

$$\cos \alpha = \frac{\langle x, y \rangle}{\|x\|_2 \, \|y\|_2}. \tag{7.103}$$

This reduces to the ordinary relationship we find in Euclidean geometry when $x, y \in \mathbb{R}^3$.
The Cauchy-Schwarz inequality is actually a special case of the so-called Hölder$^{11}$ inequality:

$$\|x\|_p \, \|y\|_q \ge |\langle x, y \rangle|, \quad \text{with } \frac{1}{p} + \frac{1}{q} = 1. \tag{7.104}$$

The Hölder inequality reduces to the Cauchy-Schwarz inequality when $p = q = 2$.
Examples of Hilbert spaces include

• Finite dimensional vector spaces

- $x \in \mathbb{R}^3$, $y \in \mathbb{R}^3$ with $\langle x, y \rangle = x^T y = x_1 y_1 + x_2 y_2 + x_3 y_3$, where $x = (x_1, x_2, x_3)^T$,
and $y = (y_1, y_2, y_3)^T$. This is the ordinary dot product for three-dimensional
Cartesian vectors. With this definition of the inner product $\langle x, x \rangle = \|x\|^2 =
x_1^2 + x_2^2 + x_3^2$, so the space is the Euclidean space, $\mathbb{E}^3$. The space is also $\ell_2(\mathbb{R}^3)$ or
$\ell_2^3$.

- $x \in \mathbb{R}^N$, $y \in \mathbb{R}^N$ with $\langle x, y \rangle = x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_N y_N$, where $x =
(x_1, x_2, \ldots, x_N)^T$, and $y = (y_1, y_2, \ldots, y_N)^T$. This is the ordinary dot product for
$N$-dimensional Cartesian vectors; the space is the Euclidean space, $\mathbb{E}^N$, or $\ell_2(\mathbb{R}^N)$,
or $\ell_2^N$.

- $x \in \mathbb{C}^N$, $y \in \mathbb{C}^N$ with $\langle x, y \rangle = \overline{x}^T y = \overline{x}_1 y_1 + \overline{x}_2 y_2 + \cdots + \overline{x}_N y_N$, where $x =
(x_1, x_2, \ldots, x_N)^T$, and $y = (y_1, y_2, \ldots, y_N)^T$. This space is also $\ell_2(\mathbb{C}^N)$. Note that

* $\langle x, x \rangle = \overline{x}_1 x_1 + \overline{x}_2 x_2 + \cdots + \overline{x}_N x_N = |x_1|^2 + |x_2|^2 + \cdots + |x_N|^2 = \|x\|_2^2$.

* $\langle x, y \rangle = \overline{x}_1 y_1 + \overline{x}_2 y_2 + \cdots + \overline{x}_N y_N$.

* It is easily shown that this definition guarantees $\|x\|_2 \ge 0$ and $\langle x, y \rangle =
\overline{\langle y, x \rangle}$.

• Lebesgue spaces

- $x \in L_2[a, b]$, $y \in L_2[a, b]$, $t \in [a, b] \in \mathbb{R}^1$ with $\langle x, y \rangle = \int_a^b x(t) y(t) \, dt$.

11 Otto Hölder, 1859-1937, Stuttgart-born German mathematician.
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Figure 7.3: Venn diagram showing relationship between various classes of spaces. 



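- $x \in \mathbb{L}_2[a, b]$, $y \in \mathbb{L}_2[a, b]$, $t \in [a, b] \in \mathbb{R}^1$ with $\langle x, y \rangle = \int_a^b \overline{x}(t) y(t) \, dt$.

• Sobolev spaces

- $u \in W_2^1(G)$, $v \in W_2^1(G)$, $x \in G \in \mathbb{R}^N$, $N \in \mathbb{N}$, $u \in L_2(G)$, $\partial u / \partial x_n \in L_2(G)$, $v \in
L_2(G)$, $\partial v / \partial x_n \in L_2(G)$ with

$$\langle u, v \rangle = \int_G \left( u(x) v(x) + \sum_{n=1}^{N} \frac{\partial u}{\partial x_n} \frac{\partial v}{\partial x_n} \right) dx. \tag{7.105}$$

A Venn$^{12}$ diagram of some of the common spaces is shown in Fig. 7.3.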



7.3.2.2 Non-commutation of the inner product 

By the fourth property of inner products, we see that the inner product operation is not
commutative in general. Specifically, when the vectors are complex, $\langle x, y \rangle \ne \langle y, x \rangle$. When
the vectors $x$ and $y$ are real, the inner product is real, and the inner product commutes,
e.g. $\forall x \in \mathbb{R}^N, y \in \mathbb{R}^N$, $\langle x, y \rangle = \langle y, x \rangle$. At first glance one may wonder why one would
define a non-commutative operation. It is done to preserve the positive definite character
of the norm. If, for example, we had instead defined the inner product to commute for
complex vectors, we might have taken $\langle x, y \rangle = x^T y$. Then if we had taken $x = (i, 1)^T$
and $y = (1, 1)^T$, we would have $\langle x, y \rangle = \langle y, x \rangle = 1 + i$.

However, we would also have $\langle x, x \rangle = \|x\|_2^2 = (i, 1)(i, 1)^T = 0$! Obviously, this would violate the property of the norm
since we must have $\|x\|_2^2 > 0$ for $x \ne 0$.

12 John Venn, 1834-1923, English mathematician.

Interestingly, one can interpret the Heisenberg$^{13}$ uncertainty principle to be entirely
consistent with our definition of an inner product which does not commute in a complex space.
In quantum mechanics, the superposition of physical states of a system is defined by a
complex-valued vector field. Position is determined by application of a position operator,
and momentum is determined by application of a momentum operator. If one wants to know
both position and momentum, both operators are applied. However, they do not commute,
and application of them in different orders leads to a result which varies by a factor related
to Planck's$^{14}$ constant.

Matrix multiplication is another example of an operation that does not commute,
in general. Such topics are considered in the more general group theory. Operators that
commute are known as Abelian$^{15}$ and those that do not are known as non-Abelian.

7.3.2.3 Minkowski space 

While non-relativistic quantum mechanics, as well as classical mechanics, works well in com- 
plex Hilbert spaces, the situation becomes more difficult when one considers Einstein's theo- 
ries of special and general relativity. In those theories, which are developed to be consistent 
with experimental observations of 1) systems moving at velocities near the speed of light, 
2) systems involving vast distances and gravitation, or 3) systems involving minute length 
scales, the relevant linear vector space is known as Minkowski space. The vectors have four 
components, describing the three space-like and one time-like location of an event in space- 
time, given for example by x = (xq, xi, X2, xs) T , where xq = ct, with c as the speed of light. 
Unlike Hilbert or Banach spaces, however, norms and inner products in the sense that we 
have defined do not exist! While so-called Minkowski norms and Minkowski inner products 
are defined in Minkowski space, they are defined in such a fashion that the inner product of a 
space-time vector with itself can be negative! From the theory of special relativity, the inner 
product which renders the equations invariant under a Lorentz$^{16}$ transformation (necessary
so that the speed of light measures the same in all frames and, moreover, not the Galilean$^{17}$
transformation of Newtonian theory) is

$$\langle x, x \rangle = \|x\|_2^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2. \tag{7.106}$$

Obviously, this inner product can take on negative values. The theory goes on to show that
when relativistic effects are important, ordinary concepts of Euclidean geometry become 
meaningless, and a variety of non-intuitive results can be obtained. In the Venn diagram, 
we see that Minkowski spaces certainly are not Banach, but there are also linear spaces that 
are not Minkowski, so it occupies an island in the diagram. 



13 Werner Karl Heisenberg, 1901-1976, German physicist.

14 Max Karl Ernst Ludwig Planck, 1858-1947, German physicist.

15 Niels Henrik Abel, 1802-1829, Norwegian mathematician, considered solution of quintic equations by
elliptic functions, proved impossibility of solving quintic equations with radicals, gave first solution of an
integral equation, famously ignored by Gauss.

16 Hendrik Antoon Lorentz, 1853-1928, Dutch physicist.

17 after Galileo Galilei, 1564-1642, Italian polymath.
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I 

Example 7.14 

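For $x$ and $y$ belonging to a Hilbert space, prove the parallelogram equality:

$$\|x + y\|_2^2 + \|x - y\|_2^2 = 2\|x\|_2^2 + 2\|y\|_2^2. \tag{7.107}$$

The left side is

$$\langle x + y, x + y \rangle + \langle x - y, x - y \rangle = \left( \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle \right) \tag{7.108}$$

$$+ \left( \langle x, x \rangle - \langle x, y \rangle - \langle y, x \rangle + \langle y, y \rangle \right), \tag{7.109}$$

$$= 2\langle x, x \rangle + 2\langle y, y \rangle, \tag{7.110}$$

$$= 2\|x\|_2^2 + 2\|y\|_2^2. \tag{7.111}$$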



I 

Example 7.15 

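For $x, y \in \ell_2(\mathbb{R}^2)$, find $\langle x, y \rangle$ if

$$x = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad y = \begin{pmatrix} 2 \\ -2 \end{pmatrix}. \tag{7.112}$$

The solution is

$$\langle x, y \rangle = x^T y = \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ -2 \end{pmatrix} = (1)(2) + (3)(-2) = -4. \tag{7.113}$$

Note that the inner product yields a real scalar, but in contrast to the norm, it can be negative. Note
also that the Cauchy-Schwarz inequality holds as $\|x\|_2 \|y\|_2 = \sqrt{10}\sqrt{8} \approx 8.944 \ge |-4|$. Also the
Minkowski inequality holds as $\|x + y\|_2 = \|(3, 1)^T\|_2 = +\sqrt{10} \le \|x\|_2 + \|y\|_2 = \sqrt{10} + \sqrt{8}$.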

I 



Example 7.16 

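For $x, y \in \ell_2(\mathbb{C}^2)$, find $\langle x, y \rangle$ if

$$x = \begin{pmatrix} -1 + i \\ 3 - 2i \end{pmatrix}, \quad y = \begin{pmatrix} 1 - 2i \\ -2 \end{pmatrix}. \tag{7.114}$$

The solution is

$$\langle x, y \rangle = \overline{x}^T y = \begin{pmatrix} -1 - i & 3 + 2i \end{pmatrix} \begin{pmatrix} 1 - 2i \\ -2 \end{pmatrix} = (-1 - i)(1 - 2i) + (3 + 2i)(-2) = -9 - 3i. \tag{7.115}$$

Note that the inner product is a complex scalar whose real and imaginary parts are both negative. It is easily shown that
$\|x\|_2 = \sqrt{15} \approx 3.873$ and $\|y\|_2 = 3$ and $\|x + y\|_2 = 2.4495$. Also $|\langle x, y \rangle| = 9.4868$. The Cauchy-Schwarz
inequality holds as $(3.873)(3) = 11.62 \ge 9.4868$. The Minkowski inequality holds as $2.4495 \le 3.873 + 3 =
6.873$.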
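The numbers quoted in Example 7.16 can be checked with a few lines of Python. This is a minimal sketch, assuming hypothetical helpers `inner` and `norm2` that follow the convention of conjugating the first argument; they are not functions defined in the notes.

```python
import math

def inner(x, y):
    """Inner product on C^N with the conjugate on the first argument."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(x, y))

def norm2(x):
    """Hilbert space norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

x = [-1 + 1j, 3 - 2j]
y = [1 - 2j, -2]
ip = inner(x, y)
total = [a + b for a, b in zip(x, y)]   # the vector x + y

print("inner product  :", ip)
print("Cauchy-Schwarz :", norm2(x) * norm2(y), ">=", abs(ip))
print("Minkowski      :", norm2(total), "<=", norm2(x) + norm2(y))
print("non-commutation:", inner(y, x), "vs", ip)
```

The last line illustrates the fourth property of the inner product: swapping the arguments yields the complex conjugate rather than the same value.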
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I 

Example 7.17 

For $x, y \in L_2[0, 1]$, find $\langle x, y \rangle$ if

$$x(t) = 3t + 4, \quad y(t) = -t - 1. \tag{7.116}$$

The solution is

$$\langle x, y \rangle = \int_0^1 (3t + 4)(-t - 1) \, dt = \left. \left( -4t - \frac{7t^2}{2} - t^3 \right) \right|_0^1 = -\frac{17}{2} = -8.5. \tag{7.117}$$

Once more the inner product is a negative scalar. It is easily shown that $\|x\|_2 = 5.56776$ and $\|y\|_2 =
1.52753$ and $\|x + y\|_2 = 4.04145$. Also $|\langle x, y \rangle| = 8.5$. It is easily seen that the Cauchy-Schwarz
inequality holds as $(5.56776)(1.52753) = 8.505 \ge 8.5$. The Minkowski inequality holds as $4.04145 \le
5.56776 + 1.52753 = 7.095$.

I 



I 

Example 7.18 

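For $x, y \in \mathbb{L}_2[0, 1]$, find $\langle x, y \rangle$ if

$$x(t) = it, \quad y(t) = t + i. \tag{7.118}$$

We recall that

$$\langle x, y \rangle = \int_0^1 \overline{x}(t) y(t) \, dt. \tag{7.119}$$

The solution is

$$\langle x, y \rangle = \int_0^1 (-it)(t + i) \, dt = \left. \left( \frac{t^2}{2} - \frac{i t^3}{3} \right) \right|_0^1 = \frac{1}{2} - \frac{i}{3}. \tag{7.120}$$

The inner product is a complex scalar. It is easily shown that $\|x\|_2 = 0.57735$ and $\|y\|_2 = 1.1547$ and
$\|x + y\|_2 = 1.6330$. Also $|\langle x, y \rangle| = 0.601$. The Cauchy-Schwarz inequality holds as $(0.57735)(1.1547) =
0.6667 \ge 0.601$. The Minkowski inequality holds as $1.63299 \le 0.57735 + 1.1547 = 1.7321$.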

I 



I 

Example 7.19 

For $u, v \in W_2^1(G)$, find $\langle u, v \rangle$ if

$$u(x) = x_1 + x_2, \quad v(x) = -x_1 x_2, \tag{7.121}$$

and $G$ is the square region in the $x_1, x_2$ plane $x_1 \in [0, 1]$, $x_2 \in [0, 1]$.

CC BY-NC-ND. 29 July 2012, Sen & Powers.



We recall that

$$\langle u, v \rangle = \int_G \left( u(x) v(x) + \frac{\partial u}{\partial x_1} \frac{\partial v}{\partial x_1} + \frac{\partial u}{\partial x_2} \frac{\partial v}{\partial x_2} \right) dx, \tag{7.122}$$

$$\langle u, v \rangle = \int_0^1 \int_0^1 \left( (x_1 + x_2)(-x_1 x_2) + (1)(-x_2) + (1)(-x_1) \right) dx_1 \, dx_2 = -\frac{4}{3} = -1.33333. \tag{7.123}$$

The inner product here is a negative real scalar. It is easily shown that $\|u\|_{1,2} = 1.77951$ and $\|v\|_{1,2} =
0.881917$ and $\|u + v\|_{1,2} = 1.13039$. Also $|\langle u, v \rangle| = 1.33333$. The Cauchy-Schwarz inequality holds
as $(1.77951)(0.881917) = 1.56938 \ge 1.33333$. The Minkowski inequality holds as $1.13039 \le 1.77951 +
0.881917 = 2.66143$.

I 



7.3.2.4 Orthogonality 

One of the primary advantages of working in Hilbert spaces is that the inner product allows
one to utilize the useful concept of orthogonality:

• $x$ and $y$ are said to be orthogonal to each other if

$$\langle x, y \rangle = 0. \tag{7.124}$$

• In an orthogonal set of vectors $\{v_1, v_2, \ldots\}$ the elements of the set are all orthogonal
to each other, so that $\langle v_n, v_m \rangle = 0$ if $n \ne m$.

• If a set $\{\varphi_1, \varphi_2, \ldots\}$ exists such that $\langle \varphi_n, \varphi_m \rangle = \delta_{nm}$, then the elements of the set are
orthonormal.

• A basis $\{v_1, v_2, \ldots, v_N\}$ of a finite-dimensional space that is also orthogonal is an
orthogonal basis. On dividing each vector by its norm we get

$$\varphi_n = \frac{v_n}{\sqrt{\langle v_n, v_n \rangle}}, \tag{7.125}$$

to give us an orthonormal basis $\{\varphi_1, \varphi_2, \ldots, \varphi_N\}$.



I 

Example 7.20 

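If elements $x$ and $y$ of an inner product space are orthogonal to each other, prove the Pythagorean
theorem

$$\|x\|_2^2 + \|y\|_2^2 = \|x + y\|_2^2. \tag{7.126}$$

The right side is

$$\langle x + y, x + y \rangle = \langle x, x \rangle + \underbrace{\langle x, y \rangle}_{=0} + \underbrace{\langle y, x \rangle}_{=0} + \langle y, y \rangle, \tag{7.127}$$

$$= \langle x, x \rangle + \langle y, y \rangle, \tag{7.128}$$

$$= \|x\|_2^2 + \|y\|_2^2, \quad \text{QED.} \tag{7.129}$$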
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I 

Example 7.21 

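Show that an orthogonal set of non-zero vectors in an inner product space is linearly independent.

Let $\{v_1, v_2, \ldots, v_n, \ldots, v_N\}$ be an orthogonal set of non-zero vectors. Then consider

$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n + \cdots + \alpha_N v_N = 0. \tag{7.130}$$

Taking the inner product with $v_n$, we get

$$\langle v_n, (\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n + \cdots + \alpha_N v_N) \rangle = \langle v_n, 0 \rangle, \tag{7.131}$$

$$\alpha_1 \underbrace{\langle v_n, v_1 \rangle}_{0} + \alpha_2 \underbrace{\langle v_n, v_2 \rangle}_{0} + \cdots + \alpha_n \underbrace{\langle v_n, v_n \rangle}_{\ne 0} + \cdots + \alpha_N \underbrace{\langle v_n, v_N \rangle}_{0} = 0, \tag{7.132}$$

$$\alpha_n \langle v_n, v_n \rangle = 0, \tag{7.133}$$

since all the other inner products are zero. Because $\langle v_n, v_n \rangle > 0$ for $v_n \ne 0$, we must have $\alpha_n = 0$ for every $n$, indicating that the set $\{v_1, v_2, \ldots, v_n, \ldots, v_N\}$
is linearly independent.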

I 



7.3.2.5 Gram-Schmidt procedure 

In a given inner product space, the Gram-Schmidt$^{18}$ procedure can be used to find an
orthonormal set using a linearly independent set of vectors.



I 

Example 7.22 

Find an orthonormal set of vectors $\{\varphi_1, \varphi_2, \ldots\}$ in $L_2[-1, 1]$ using linear combinations of the linearly
independent set of vectors $\{1, t, t^2, t^3, \ldots\}$ where $-1 \le t \le 1$.

Choose

$$v_1(t) = 1. \tag{7.134}$$

Now choose the second vector linearly independent of $v_1$ as

$$v_2(t) = a + bt. \tag{7.135}$$

This should be orthogonal to $v_1$, so that

$$\int_{-1}^{1} v_1(t) v_2(t) \, dt = 0, \tag{7.136}$$

$$\int_{-1}^{1} \underbrace{(1)}_{=v_1(t)} \underbrace{(a + bt)}_{=v_2(t)} \, dt = 0, \tag{7.137}$$

$$\left. \left( at + \frac{bt^2}{2} \right) \right|_{-1}^{1} = 0, \tag{7.138}$$

$$a(1 - (-1)) + \frac{b}{2} \left( 1^2 - (-1)^2 \right) = 0, \tag{7.139}$$



18 Jørgen Pedersen Gram, 1850-1916, Danish actuary and mathematician, and Erhard Schmidt, 1876-1959,
German/Estonian-born Berlin mathematician, studied under David Hilbert, founder of modern functional
analysis. The Gram-Schmidt procedure was actually first introduced by Laplace.

from which

$$a = 0. \tag{7.140}$$

Taking $b = 1$ arbitrarily, since orthogonality does not depend on the magnitude of $v_2(t)$, we have

$$v_2 = t. \tag{7.141}$$

Choose the third vector linearly independent of $v_1(t)$ and $v_2(t)$, i.e.

$$v_3(t) = a + bt + ct^2. \tag{7.142}$$

For this to be orthogonal to $v_1(t)$ and $v_2(t)$, we get the conditions

$$\int_{-1}^{1} \underbrace{(1)}_{=v_1(t)} \underbrace{(a + bt + ct^2)}_{=v_3(t)} \, dt = 0, \tag{7.143}$$

$$\int_{-1}^{1} \underbrace{t}_{=v_2(t)} \underbrace{(a + bt + ct^2)}_{=v_3(t)} \, dt = 0. \tag{7.144}$$

The first of these gives $c = -3a$. Taking $a = 1$ arbitrarily, we have $c = -3$. The second relation gives
$b = 0$. Thus

$$v_3 = 1 - 3t^2. \tag{7.145}$$

In this manner we can find as many orthogonal vectors as we want. We can make them orthonormal
by dividing each by its norm, so that we have

$$\varphi_1 = \frac{1}{\sqrt{2}}, \tag{7.146}$$

$$\varphi_2 = \sqrt{\frac{3}{2}} \, t, \tag{7.147}$$

$$\varphi_3 = \sqrt{\frac{5}{8}} \left( 1 - 3t^2 \right), \tag{7.148}$$

Scalar multiples of these functions, with the functions set to unity at $t = 1$, are the Legendre
polynomials: $P_0(t) = 1$, $P_1(t) = t$, $P_2(t) = (1/2)(3t^2 - 1)$, ... As studied earlier in Chapter 5, some other
common orthonormal sets can be formed on the foundation of several eigenfunctions to Sturm-Liouville
differential equations.
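The construction in Example 7.22 can be automated. The sketch below is a hedged illustration in Python, not the procedure as written in the notes: instead of solving for unknown coefficients $a, b, c$, it uses the common projection-subtraction form of Gram-Schmidt, which yields the same orthogonal directions up to scalar multiples. Polynomials are stored as coefficient lists (lowest power first), and the helper names `inner` and `gram_schmidt` are assumptions.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(t) q(t) dt for coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                # Integral of t**(i+j) over [-1, 1]; odd powers integrate to zero.
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def gram_schmidt(count):
    """Orthogonalize 1, t, t**2, ... on [-1, 1]; returns monic polynomials."""
    basis = []
    for n in range(count):
        monomial = [Fraction(0)] * n + [Fraction(1)]   # t**n
        w = monomial[:]
        for u in basis:
            c = inner(u, monomial) / inner(u, u)       # projection coefficient
            for k, uk in enumerate(u):
                w[k] -= c * uk
        basis.append(w)
    return basis

basis = gram_schmidt(4)
squared_norms = [inner(v, v) for v in basis]
for v, s in zip(basis, squared_norms):
    print([str(c) for c in v], "squared norm:", s)
```

The third polynomial produced here, $t^2 - 1/3$, is $-1/3$ times the $v_3 = 1 - 3t^2$ of Eq. (7.145); dividing each result by the square root of its squared norm gives an orthonormal set equal to Eqs. (7.146-7.148) up to sign.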

I 



7.3.2.6 Projection of a vector onto a new basis 

Here we consider how to project $N$-dimensional vectors $x$, first onto general non-orthogonal
bases of dimension $M \le N$, and then specialize for orthogonal bases of dimension $M \le N$.
For ordinary vectors in Euclidean space, $N$ and $M$ will be integers. When $M < N$, we will
usually lose information in projecting the $N$-dimensional $x$ onto a lower $M$-dimensional basis.
When $M = N$, we will lose no information, and the projection can be better characterized
as a new representation. While much of our discussion is most easily digested when $M$ and
$N$ take on finite values, the analysis will be easily extended to infinite dimension, which is
appropriate for a space of vectors which are functions.

7.3.2.6.1 Non-orthogonal basis We are given $M$ linearly independent non-orthogonal
basis vectors $\{u_1, u_2, \ldots, u_M\}$ on which to project the $N$-dimensional $x$, with $M \le N$. Each
of the $M$ basis vectors, $u_m$, is taken for convenience to be a vector of length $N$; we must
realize that both $x$ and $u_m$ could be functions as well, in which case saying they have length
$N$ would be meaningless.

The general task here is to find expressions for the coefficients $\alpha_m$, $m = 1, 2, \ldots, M$, to
best represent $x$ in the linear combination

$$\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_M u_M = \sum_{m=1}^{M} \alpha_m u_m \simeq x. \tag{7.149}$$

We use the notation for an approximation, $\simeq$, because for $M < N$, $x$ most likely will not
be exactly equal to the linear combination of basis vectors. Since $u_m \in \mathbb{C}^N$, we can define $\mathbf{U}$
as the $N \times M$ matrix whose $M$ columns are populated by the $M$ basis vectors of length $N$,
$u_1, u_2, \ldots, u_M$. We can thus rewrite Eq. (7.149) as

$$\mathbf{U} \cdot \alpha \simeq x. \tag{7.150}$$

If $M = N$, the approximation would become an equality; thus, we could invert Eq. (7.150)
and find simply that $\alpha = \mathbf{U}^{-1} \cdot x$. However, if $M < N$, $\mathbf{U}^{-1}$ does not exist, and we cannot
use this approach to find $\alpha$. We need another strategy.

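To get the values of $\alpha_m$ in the most general of cases, we begin by taking inner products
of Eq. (7.149) with $u_1$ to get

$$\langle u_1, \alpha_1 u_1 \rangle + \langle u_1, \alpha_2 u_2 \rangle + \cdots + \langle u_1, \alpha_M u_M \rangle = \langle u_1, x \rangle. \tag{7.151}$$

Using the properties of an inner product and performing the procedure for all $u_m$, $m =
1, \ldots, M$, we get

$$\alpha_1 \langle u_1, u_1 \rangle + \alpha_2 \langle u_1, u_2 \rangle + \cdots + \alpha_M \langle u_1, u_M \rangle = \langle u_1, x \rangle, \tag{7.152}$$

$$\alpha_1 \langle u_2, u_1 \rangle + \alpha_2 \langle u_2, u_2 \rangle + \cdots + \alpha_M \langle u_2, u_M \rangle = \langle u_2, x \rangle, \tag{7.153}$$

$$\vdots$$

$$\alpha_1 \langle u_M, u_1 \rangle + \alpha_2 \langle u_M, u_2 \rangle + \cdots + \alpha_M \langle u_M, u_M \rangle = \langle u_M, x \rangle. \tag{7.154}$$

Knowing $x$ and $u_1, u_2, \ldots, u_M$, all the inner products can be determined, and Eqs. (7.152-7.154)
can be posed as the linear algebraic system:

$$\underbrace{\begin{pmatrix} \langle u_1, u_1 \rangle & \langle u_1, u_2 \rangle & \cdots & \langle u_1, u_M \rangle \\ \langle u_2, u_1 \rangle & \langle u_2, u_2 \rangle & \cdots & \langle u_2, u_M \rangle \\ \vdots & \vdots & & \vdots \\ \langle u_M, u_1 \rangle & \langle u_M, u_2 \rangle & \cdots & \langle u_M, u_M \rangle \end{pmatrix}}_{\overline{\mathbf{U}}^T \cdot \mathbf{U}} \cdot \underbrace{\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_M \end{pmatrix}}_{\alpha} = \underbrace{\begin{pmatrix} \langle u_1, x \rangle \\ \langle u_2, x \rangle \\ \vdots \\ \langle u_M, x \rangle \end{pmatrix}}_{\overline{\mathbf{U}}^T \cdot x}. \tag{7.155}$$

Equation (7.155) can also be written compactly, with summation over the repeated index $m$ implied, as

$$\langle u_i, u_m \rangle \alpha_m = \langle u_i, x \rangle. \tag{7.156}$$

In either case, Cramer's rule or Gaussian elimination can be used to determine the unknown
coefficients, $\alpha_m$.

We can understand this in another way by considering an approach using Gibbs notation,
valid when each of the $M$ basis vectors $u_m \in \mathbb{C}^N$. Note that the Gibbs notation does not
suffice for other classes of basis vectors, e.g. when the vectors are functions, $u_m \in L_2$. Operate
on Eq. (7.150) with $\overline{\mathbf{U}}^T$ to get

$$\left( \overline{\mathbf{U}}^T \cdot \mathbf{U} \right) \cdot \alpha = \overline{\mathbf{U}}^T \cdot x. \tag{7.157}$$

This is the Gibbs notation equivalent of Eq. (7.155). We cannot expect $\mathbf{U}^{-1}$ to always exist;
however, as long as the $M \le N$ basis vectors are linearly independent, we can expect the
$M \times M$ matrix $\left( \overline{\mathbf{U}}^T \cdot \mathbf{U} \right)^{-1}$ to exist. We can then solve for the coefficients $\alpha$ via

$$\alpha = \left( \overline{\mathbf{U}}^T \cdot \mathbf{U} \right)^{-1} \cdot \overline{\mathbf{U}}^T \cdot x, \quad M \le N. \tag{7.158}$$

In this case, one is projecting $x$ onto a basis of equal or lower dimension than itself, and
we recover the $M \times 1$ vector $\alpha$. If one then operates on both sides of Eq. (7.158) with the
$N \times M$ operator $\mathbf{U}$, one gets

$$\mathbf{U} \cdot \alpha = \underbrace{\mathbf{U} \cdot \left( \overline{\mathbf{U}}^T \cdot \mathbf{U} \right)^{-1} \cdot \overline{\mathbf{U}}^T}_{\mathbf{P}} \cdot x = x_p. \tag{7.159}$$

Here we have defined the $N \times N$ projection matrix $\mathbf{P}$ as

$$\mathbf{P} = \mathbf{U} \cdot \left( \overline{\mathbf{U}}^T \cdot \mathbf{U} \right)^{-1} \cdot \overline{\mathbf{U}}^T. \tag{7.160}$$

We have also defined $x_p = \mathbf{P} \cdot x$ as the projection of $x$ onto the basis $\mathbf{U}$. These topics will
be considered later in a strictly linear algebraic context in Sec. 8.9. When there are $M = N$
linearly independent basis vectors, Eq. (7.160) can be reduced to show $\mathbf{P} = \mathbf{I}$. In this case
$\mathbf{U}^{-1}$ exists, and we get

$$\mathbf{P} = \mathbf{U} \cdot \mathbf{U}^{-1} \cdot \left( \overline{\mathbf{U}}^T \right)^{-1} \cdot \overline{\mathbf{U}}^T = \mathbf{I}. \tag{7.161}$$

So with $M = N$ linearly independent basis vectors, we have $\mathbf{U} \cdot \alpha = x$, and recover the much
simpler

$$\alpha = \mathbf{U}^{-1} \cdot x, \quad M = N. \tag{7.162}$$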
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I 

Example 7.23 

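Project the vector $x = \begin{pmatrix} 6 \\ -3 \end{pmatrix}$ onto the non-orthogonal basis composed of $u_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$, $u_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

Here we have the length of $x$ as $N = 2$, and we have $M = N = 2$ linearly independent basis vectors.
When the basis vectors are combined into a set of column vectors, they form the matrix

$$\mathbf{U} = \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix}. \tag{7.163}$$

Because we have a sufficient number of basis vectors to span the space, to get $\alpha$, we can simply apply
Eq. (7.162) to get

$$\alpha = \mathbf{U}^{-1} \cdot x, \tag{7.164}$$

$$= \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix}^{-1} \cdot \begin{pmatrix} 6 \\ -3 \end{pmatrix}, \tag{7.165}$$

$$= \begin{pmatrix} \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & -\frac{2}{3} \end{pmatrix} \cdot \begin{pmatrix} 6 \\ -3 \end{pmatrix}, \tag{7.166}$$

$$= \begin{pmatrix} 1 \\ 4 \end{pmatrix}. \tag{7.167}$$

Thus

$$x = \alpha_1 u_1 + \alpha_2 u_2 = 1 \begin{pmatrix} 2 \\ 1 \end{pmatrix} + 4 \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 6 \\ -3 \end{pmatrix}. \tag{7.168}$$

The projection matrix $\mathbf{P} = \mathbf{I}$, and $x_p = x$. Thus, the projection is actually a representation, with no
lost information.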
I 



I 

Example 7.24 

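Project the vector $x = \begin{pmatrix} 6 \\ -3 \end{pmatrix}$ on the basis composed of $u_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.

Here we have a vector $x$ with $N = 2$ and an $M = 1$ linearly independent basis vector which, when
cast into columns, forms

$$\mathbf{U} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \tag{7.169}$$

This vector does not span the space, so to get the projection, we must use the more general Eq. (7.158),
which reduces to

$$\alpha = \Bigg( \underbrace{\begin{pmatrix} 2 & 1 \end{pmatrix}}_{\mathbf{U}^T} \cdot \underbrace{\begin{pmatrix} 2 \\ 1 \end{pmatrix}}_{\mathbf{U}} \Bigg)^{-1} \cdot \underbrace{\begin{pmatrix} 2 & 1 \end{pmatrix}}_{\mathbf{U}^T} \cdot \underbrace{\begin{pmatrix} 6 \\ -3 \end{pmatrix}}_{x} = (5)^{-1} (9) = \left( \frac{9}{5} \right). \tag{7.170}$$

So the projection is

$$x_p = \alpha_1 u_1 = \left( \frac{9}{5} \right) \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{18}{5} \\ \frac{9}{5} \end{pmatrix}. \tag{7.171}$$

Note that the projection is not obtained by simply setting $\alpha_2 = 0$ from the previous example. This is
because the component of $x$ aligned with $u_2$ itself has a projection onto $u_1$. Had $u_1$ been orthogonal
to $u_2$, one could have obtained the projection onto $u_1$ by setting $\alpha_2 = 0$.
The projection matrix is

$$\mathbf{P} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \cdot \left( \begin{pmatrix} 2 & 1 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right)^{-1} \cdot \begin{pmatrix} 2 & 1 \end{pmatrix} = \begin{pmatrix} \frac{4}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{1}{5} \end{pmatrix}. \tag{7.172}$$

It is easily verified that $x_p = \mathbf{P} \cdot x$.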

I 

Example 7.25 

Project the function $x(t) = t^3$, $t \in [0, 1]$ onto the space spanned by the non-orthogonal basis
functions $u_1 = t$, $u_2 = \sin(4t)$.

This is an unusual projection. The $M = 2$ basis functions are not orthogonal. In fact they bear no
clear relation to each other. The success in finding approximations to the original function which are
accurate depends on how well the chosen basis functions approximate the original function.

The appropriateness of the basis functions notwithstanding, it is not difficult to calculate the
projection. Equation (7.155) reduces to

$$\begin{pmatrix} \int_0^1 (t)(t) \, dt & \int_0^1 (t) \sin 4t \, dt \\ \int_0^1 (\sin 4t)(t) \, dt & \int_0^1 \sin^2 4t \, dt \end{pmatrix} \cdot \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} \int_0^1 (t)(t^3) \, dt \\ \int_0^1 (\sin 4t)(t^3) \, dt \end{pmatrix}. \tag{7.173}$$

Evaluating the integrals gives

$$\begin{pmatrix} 0.333333 & 0.116111 \\ 0.116111 & 0.438165 \end{pmatrix} \cdot \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} 0.2 \\ -0.0220311 \end{pmatrix}. \tag{7.174}$$

Inverting and solving gives

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} 0.680311 \\ -0.230558 \end{pmatrix}. \tag{7.175}$$

So our projection of $x(t) = t^3$ onto the basis functions yields the approximation $x_p(t)$:

$$x(t) = t^3 \simeq x_p(t) = \alpha_1 u_1 + \alpha_2 u_2 = 0.680311 t - 0.230558 \sin 4t. \tag{7.176}$$

Figure 7.4 shows the original function and its two-term approximation. It seems the approximation is
not bad; however, there is no clear path to improvement by adding more basis functions. So one might
imagine in a very specialized problem that the ability to project onto an unusual basis could be useful.
But in general this is not the approach taken.
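The integrals and the $2 \times 2$ solve of Example 7.25 can be reproduced numerically. The sketch below is a minimal illustration, assuming hypothetical helpers `simpson` and `inner` and using Cramer's rule for the two unknowns; the notes do not prescribe a particular quadrature.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def inner(f, g):
    """Real L_2[0, 1] inner product of two functions."""
    return simpson(lambda t: f(t) * g(t), 0.0, 1.0)

def u1(t):
    return t

def u2(t):
    return math.sin(4.0 * t)

def x(t):
    return t ** 3

# Entries of the system in Eq. (7.173).
g11, g12 = inner(u1, u1), inner(u1, u2)
g21, g22 = inner(u2, u1), inner(u2, u2)
b1, b2 = inner(u1, x), inner(u2, x)

# Cramer's rule for the two coefficients.
det = g11 * g22 - g12 * g21
alpha1 = (b1 * g22 - g12 * b2) / det
alpha2 = (g11 * b2 - g21 * b1) / det
print(alpha1, alpha2)
```

Swapping in different basis functions only requires redefining `u1` and `u2`; for more than two functions a general linear solver would replace Cramer's rule.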

I 



I 

Example 7.26 

Project the function $x = e^t$, $t \in [0, 1]$ onto the space spanned by the functions $u_m = t^{m-1}$, $m =
1, \ldots, M$, for $M = 4$.

Figure 7.4: Projection of $x(t) = t^3$ onto a two-term non-orthogonal basis composed of
functions $u_1 = t$, $u_2 = \sin 4t$. (Curves shown: $x = t^3$ and $x_p = 0.68 t - 0.23 \sin 4t$.)



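Similar to the previous example, the basis functions are non-orthogonal. Unlike the previous example, there is a clear way to improve the approximation by increasing $M$. For $M = 4$, Eq. (7.155) reduces to

$$\begin{pmatrix}
\int_0^1 (1)(1)\,dt & \int_0^1 (1)(t)\,dt & \int_0^1 (1)(t^2)\,dt & \int_0^1 (1)(t^3)\,dt \\
\int_0^1 (t)(1)\,dt & \int_0^1 (t)(t)\,dt & \int_0^1 (t)(t^2)\,dt & \int_0^1 (t)(t^3)\,dt \\
\int_0^1 (t^2)(1)\,dt & \int_0^1 (t^2)(t)\,dt & \int_0^1 (t^2)(t^2)\,dt & \int_0^1 (t^2)(t^3)\,dt \\
\int_0^1 (t^3)(1)\,dt & \int_0^1 (t^3)(t)\,dt & \int_0^1 (t^3)(t^2)\,dt & \int_0^1 (t^3)(t^3)\,dt
\end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix}
=
\begin{pmatrix} \int_0^1 (1)(e^t)\,dt \\ \int_0^1 (t)(e^t)\,dt \\ \int_0^1 (t^2)(e^t)\,dt \\ \int_0^1 (t^3)(e^t)\,dt \end{pmatrix}. \tag{7.177}$$

Evaluating the integrals, this becomes

$$\begin{pmatrix}
1 & 1/2 & 1/3 & 1/4 \\
1/2 & 1/3 & 1/4 & 1/5 \\
1/3 & 1/4 & 1/5 & 1/6 \\
1/4 & 1/5 & 1/6 & 1/7
\end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix}
=
\begin{pmatrix} -1 + e \\ 1 \\ -2 + e \\ 6 - 2e \end{pmatrix}. \tag{7.178}$$

Solving for $\alpha_m$, and composing the approximation gives

$$x_p(t) = 0.999060 + 1.01830\,t + 0.421246\,t^2 + 0.278625\,t^3. \tag{7.179}$$

We can compare this to $x_T(t)$, the four-term Taylor series approximation of $e^t$ about $t = 0$:

\begin{align}
x_T(t) &= 1 + t + \frac{t^2}{2} + \frac{t^3}{6} \simeq e^t, \tag{7.180} \\
&= 1.00000 + 1.00000\,t + 0.500000\,t^2 + 0.166667\,t^3. \tag{7.181}
\end{align}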



Obviously, the Taylor series approximation is very close to the $M = 4$ projection. The Taylor approximation, $x_T(t)$, gains accuracy as $t \to 0$, while our $x_p(t)$ is better suited to the entire domain $t \in [0,1]$. We can expect as $M \to \infty$ for the value of each $\alpha_m$ to approach those given by the independent Taylor series approximation. Figure 7.5 shows the original function against its $M = 1, 2, 3, 4$-term approximations, as well as the error. Clearly the approximation improves as $M$ increases; for $M = 4$, the graphs of the original function and its approximation are indistinguishable at this scale.

Also we note that the so-called root-mean-square (rms) error, $E_2$, is lower for our approximation relative to the Taylor series approximation about $t = 0$. We define rms errors, $E_2^p$, $E_2^T$, in terms of a norm, for both our projection and the Taylor approximation, respectively, and find

$$E_2^p = \|x_p(t) - x(t)\|_2 = \sqrt{\int_0^1 \left(x_p(t) - e^t\right)^2 dt} = 0.000331, \tag{7.182}$$

$$E_2^T = \|x_T(t) - x(t)\|_2 = \sqrt{\int_0^1 \left(x_T(t) - e^t\right)^2 dt} = 0.016827. \tag{7.183}$$

Figure 7.5: The original function $x(t) = e^t$, $t \in [0,1]$, its projection onto various polynomial basis functions $x(t) \simeq x_p(t) = \sum_{m=1}^{M} \alpha_m t^{m-1}$, and the error, $x - x_p$, for $M = 1, 2, 3, 4$.



Our $M = 4$ approximation is better, when averaged over the entire domain, than the $M = 4$ Taylor series approximation. For larger $M$, the differences become more dramatic. For example, for $M = 10$, we find $E_2^p = 5.39 \times 10^{-13}$ and $E_2^T = 6.58 \times 10^{-8}$.
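The $M = 4$ numbers of Example 7.26 can be reproduced with a short script. This is a sketch under stated assumptions: Simpson quadrature for the integrals, plain Gaussian elimination for the linear system, and illustrative helper names (`simpson`, `solve`, `project_exp`, `rms_error`) that do not appear in the original notes.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def project_exp(M):
    """Coefficients of the projection of e^t onto 1, t, ..., t^(M-1) on [0, 1]."""
    G = [[1.0 / (i + j + 1) for j in range(M)] for i in range(M)]   # <t^i, t^j>
    rhs = [simpson(lambda t, i=i: t ** i * math.exp(t), 0.0, 1.0) for i in range(M)]
    return solve(G, rhs)

def rms_error(coeffs):
    """L2 norm on [0, 1] of (polynomial with given coefficients) - e^t."""
    p = lambda t: sum(c * t ** k for k, c in enumerate(coeffs))
    return math.sqrt(simpson(lambda t: (p(t) - math.exp(t)) ** 2, 0.0, 1.0))

alpha = project_exp(4)
taylor = [1.0 / math.factorial(k) for k in range(4)]
err_proj = rms_error(alpha)
err_taylor = rms_error(taylor)
print(alpha, err_proj, err_taylor)
```

The Gram matrix here is badly conditioned as $M$ grows, so double precision is adequate for $M = 4$ but should not be trusted for much larger $M$ without higher-precision arithmetic.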



7.3.2.6.2 Orthogonal basis  The process is simpler if the basis vectors are orthogonal. If orthogonal,

$$\langle u_i, u_m \rangle = 0, \qquad i \ne m, \tag{7.184}$$

and substituting this into Eq. (7.155), we get

$$\begin{pmatrix}
\langle u_1, u_1 \rangle & 0 & \cdots & 0 \\
0 & \langle u_2, u_2 \rangle & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \langle u_M, u_M \rangle
\end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_M \end{pmatrix}
=
\begin{pmatrix} \langle u_1, x \rangle \\ \langle u_2, x \rangle \\ \vdots \\ \langle u_M, x \rangle \end{pmatrix}. \tag{7.185}$$

Equation (7.185) can be solved directly for the coefficients:

$$\alpha_m = \frac{\langle u_m, x \rangle}{\langle u_m, u_m \rangle}. \tag{7.186}$$

So, if the basis vectors are orthogonal, we can write Eq. (7.149) as

\begin{align}
\frac{\langle u_1, x \rangle}{\langle u_1, u_1 \rangle} u_1 + \frac{\langle u_2, x \rangle}{\langle u_2, u_2 \rangle} u_2 + \cdots + \frac{\langle u_M, x \rangle}{\langle u_M, u_M \rangle} u_M &\simeq x, \tag{7.187} \\
\sum_{m=1}^{M} \frac{\langle u_m, x \rangle}{\langle u_m, u_m \rangle} u_m = \sum_{m=1}^{M} \alpha_m u_m &\simeq x. \tag{7.188}
\end{align}

If we use an orthonormal basis $\{\varphi_1, \varphi_2, \ldots, \varphi_M\}$, then the projection is even more efficient. We get the generalization of Eq. (5.222):

$$\alpha_m = \langle \varphi_m, x \rangle, \tag{7.189}$$

which yields

$$\sum_{m=1}^{M} \langle \varphi_m, x \rangle\, \varphi_m \simeq x. \tag{7.190}$$

In all cases, if $M = N$, we can replace the "$\simeq$" by an "$=$", and the approximation becomes in fact a representation.

Similar expansions apply to vectors in infinite-dimensional spaces, except that one must be careful that the orthonormal set is complete. Only then is there any guarantee that any vector can be represented as linear combinations of this orthonormal set. If $\{\varphi_1, \varphi_2, \ldots\}$ is a complete orthonormal set of vectors in some domain $\Omega$, then any vector $x$ can be represented as

$$x = \sum_{n=1}^{\infty} \alpha_n \varphi_n, \tag{7.191}$$

where

$$\alpha_n = \langle \varphi_n, x \rangle. \tag{7.192}$$

This is a Fourier series representation, as previously studied in Chapter 5, and the values of $\alpha_n$ are the Fourier coefficients. It is a representation and not just a projection because the summation runs to infinity.
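For an orthogonal basis no linear system needs to be solved, since each coefficient is $\langle u_m, x \rangle / \langle u_m, u_m \rangle$. A minimal sketch follows, assuming real vectors in $\mathbb{R}^3$ with the dot product as inner product; the vectors and the function name `project_orthogonal` are illustrative choices, not taken from the notes.

```python
def dot(a, b):
    """Dot product of two real vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def project_orthogonal(x, basis):
    """Return the coefficients <u_m, x> / <u_m, u_m> and the projection of x."""
    alphas = [dot(u, x) / dot(u, u) for u in basis]
    xp = [sum(a * u[i] for a, u in zip(alphas, basis)) for i in range(len(x))]
    return alphas, xp

u1 = [1.0, 1.0, 0.0]
u2 = [1.0, -1.0, 0.0]            # orthogonal to u1
x = [3.0, 1.0, 5.0]

alphas, xp = project_orthogonal(x, [u1, u2])
residual = [xi - pi for xi, pi in zip(x, xp)]
print(alphas, xp, residual)
```

The residual is orthogonal to both basis vectors, which is the defining property of the projection.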



Example 7.27

Expand the top hat function $x(t) = H(t - 1/4) - H(t - 3/4)$ in a Fourier sine series in the domain $t \in [0,1]$.

Here, the function $x(t)$ is discontinuous at $t = 1/4$ and $t = 3/4$. While $x(t)$ is not a member of $C[0,1]$, it is a member of $\mathbb{L}_2[0,1]$. Here we will see that the Fourier sine series projection, composed of functions which are continuous in $[0,1]$, converges to the discontinuous function $x(t)$.

Building on previous work, we know from Eq. (5.54) that the functions

$$\varphi_n(t) = \sqrt{2} \sin(n \pi t), \qquad n = 1, \ldots, \infty, \tag{7.193}$$

form an orthonormal set for $t \in [0,1]$. We then find for the Fourier coefficients

$$\alpha_n = \sqrt{2} \int_0^1 \left( H\!\left(t - \tfrac{1}{4}\right) - H\!\left(t - \tfrac{3}{4}\right) \right) \sin(n \pi t)\,dt = \sqrt{2} \int_{1/4}^{3/4} \sin(n \pi t)\,dt. \tag{7.194}$$

Performing the integration for the first nine terms, we find

$$\alpha_n = \frac{2}{\pi} \left( 1, 0, -\frac{1}{3}, 0, -\frac{1}{5}, 0, \frac{1}{7}, 0, \frac{1}{9}, \ldots \right). \tag{7.195}$$

Figure 7.6: Expansion of top hat function $x(t) = H(t - 1/4) - H(t - 3/4)$ in terms of sinusoidal basis functions for two levels of approximation, $N = 9$, $N = 36$, along with a plot of how the error converges as the number of terms increases. Panels: 9 term series; 36 term series.

Forming an approximation from these nine terms, we find

$$H\!\left(t - \tfrac{1}{4}\right) - H\!\left(t - \tfrac{3}{4}\right) \simeq \frac{2\sqrt{2}}{\pi} \left( \sin(\pi t) - \frac{\sin(3 \pi t)}{3} - \frac{\sin(5 \pi t)}{5} + \frac{\sin(7 \pi t)}{7} + \frac{\sin(9 \pi t)}{9} \right). \tag{7.196}$$

Generalizing, we get

$$H\!\left(t - \tfrac{1}{4}\right) - H\!\left(t - \tfrac{3}{4}\right) = \frac{2\sqrt{2}}{\pi} \sum_{k=1}^{\infty} (-1)^{k-1} \left( \frac{\sin((4k-3) \pi t)}{4k - 3} - \frac{\sin((4k-1) \pi t)}{4k - 1} \right). \tag{7.197}$$



The discontinuous function $x(t)$, two continuous approximations to it, and a plot revealing how the error decreases as the number of terms in the approximation increases are shown in Fig. 7.6. Note that as more terms are added, the approximation gets better at most points. But there is always a persistently large error at the discontinuities $t = 1/4$, $t = 3/4$. We say this function is convergent in $\mathbb{L}_2[0,1]$, but is not convergent in $\mathbb{L}_\infty[0,1]$. This simply says that the rms error norm converges, while the maximum error norm does not. This is an example of the well-known Gibbs phenomenon. Convergence in $\mathbb{L}_2[0,1]$ is shown in Fig. 7.6. The achieved convergence rate is $\|x_p(t) - x(t)\|_2 \sim 0.474088\,N^{-0.512}$. This suggests that

$$\lim_{N \to \infty} \|x_p(t) - x(t)\|_2 \sim \frac{1}{\sqrt{N}}, \tag{7.198}$$

where $N$ is the number of terms retained in the projection.
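The coefficients and the $\mathbb{L}_2$ error of Example 7.27 can be evaluated directly. The sketch below assumes the closed form of the integral in Eq. (7.194) and uses the orthonormality of the basis, so that the squared error of an $N$-term projection equals $\|x\|_2^2 - \sum_{n=1}^{N} \alpha_n^2$ with $\|x\|_2^2 = 1/2$ for the top hat; the function names are illustrative.

```python
import math

def alpha(n):
    """alpha_n = sqrt(2) * integral of sin(n pi t) over [1/4, 3/4]."""
    return math.sqrt(2) * (math.cos(n * math.pi / 4)
                           - math.cos(3 * n * math.pi / 4)) / (n * math.pi)

def partial_sum(t, N):
    """N-term Fourier sine approximation x_p(t)."""
    return sum(alpha(n) * math.sqrt(2) * math.sin(n * math.pi * t)
               for n in range(1, N + 1))

def l2_error(N):
    """||x - x_p||_2 from ||x||_2^2 - sum of alpha_n^2, with ||x||_2^2 = 1/2."""
    return math.sqrt(0.5 - sum(alpha(n) ** 2 for n in range(1, N + 1)))

err9, err36 = l2_error(9), l2_error(36)
jump_value = partial_sum(0.25, 400)   # partial sum evaluated at the discontinuity
print(err9, err36, jump_value)
```

At the discontinuity the partial sums settle near the midpoint of the jump rather than at either one-sided value, which is one way to see why the maximum error does not go to zero even though the rms error does.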



The previous example showed one could use continuous functions to approximate a discontinuous function. The converse is also true: discontinuous functions can be used to approximate continuous functions.


Example 7.28

Show that the functions $\varphi_1(t), \varphi_2(t), \ldots, \varphi_N(t)$ are orthonormal in $\mathbb{L}_2(0,1]$, where

$$\varphi_n(t) = \begin{cases} \sqrt{N}, & \dfrac{n-1}{N} < t \le \dfrac{n}{N}, \\ 0, & \text{otherwise.} \end{cases} \tag{7.199}$$

Expand $x(t) = t^2$ in terms of these functions, and find the error for a finite $N$.

We note that the basis functions are a set of "top hat" functions whose amplitude increases and width decreases as $N$ increases. For fixed $N$, the basis functions are a series of top hats that fills the domain $[0,1]$. The area enclosed by a single basis function is $1/\sqrt{N}$. If $n \ne m$, the inner product

$$\langle \varphi_n, \varphi_m \rangle = \int_0^1 \varphi_n(t) \varphi_m(t)\,dt = 0, \tag{7.200}$$

because the integrand is zero everywhere. If $n = m$, the inner product is

\begin{align}
\int_0^1 \varphi_n(t) \varphi_n(t)\,dt &= \int_0^{\frac{n-1}{N}} (0)(0)\,dt + \int_{\frac{n-1}{N}}^{\frac{n}{N}} \sqrt{N} \sqrt{N}\,dt + \int_{\frac{n}{N}}^{1} (0)(0)\,dt, \tag{7.201} \\
&= N \left( \frac{n}{N} - \frac{n-1}{N} \right), \tag{7.202} \\
&= 1. \tag{7.203}
\end{align}

So, $\{\varphi_1, \varphi_2, \ldots, \varphi_N\}$ is an orthonormal set. We can expand the function $f(t) = t^2$ in the form

$$t^2 = \sum_{n=1}^{N} \alpha_n \varphi_n. \tag{7.204}$$

Taking the inner product of both sides with $\varphi_m(t)$, we get

\begin{align}
\int_0^1 \varphi_m(t)\, t^2\,dt &= \int_0^1 \varphi_m(t) \sum_{n=1}^{N} \alpha_n \varphi_n(t)\,dt, \tag{7.205} \\
\int_0^1 \varphi_m(t)\, t^2\,dt &= \sum_{n=1}^{N} \alpha_n \int_0^1 \varphi_m(t) \varphi_n(t)\,dt, \tag{7.206} \\
\int_0^1 \varphi_m(t)\, t^2\,dt &= \sum_{n=1}^{N} \alpha_n \delta_{nm}, \tag{7.207} \\
\int_0^1 \varphi_m(t)\, t^2\,dt &= \alpha_m, \tag{7.208} \\
\int_0^1 \varphi_n(t)\, t^2\,dt &= \alpha_n. \tag{7.209}
\end{align}

Thus,

$$\alpha_n = 0 + \int_{\frac{n-1}{N}}^{\frac{n}{N}} t^2 \sqrt{N}\,dt + 0. \tag{7.210}$$

Thus,

$$\alpha_n = \frac{1}{3 N^{5/2}} \left( 3n^2 - 3n + 1 \right). \tag{7.211}$$

The functions $t^2$ and the partial sums $f_N(t) = \sum_{n=1}^{N} \alpha_n \varphi_n(t)$ for $N = 5$ and $N = 10$ are shown in Fig. 7.7. Detailed analysis not shown here reveals the $\mathbb{L}_2$ error for the partial sums can be calculated as $\Delta_N$, where

\begin{align}
\Delta_N^2 &= \|f(t) - f_N(t)\|_2^2, \tag{7.212} \\
&= \int_0^1 \left( t^2 - \sum_{n=1}^{N} \alpha_n \varphi_n(t) \right)^2 dt, \tag{7.213} \\
&= \frac{1}{9 N^2} \left( 1 - \frac{1}{5 N^2} \right), \tag{7.214} \\
\Delta_N &= \frac{1}{3N} \sqrt{1 - \frac{1}{5 N^2}}, \tag{7.215}
\end{align}

which vanishes as $N \to \infty$ at a rate of convergence proportional to $1/N$.

Figure 7.7: Expansion of $x(t) = t^2$ in terms of "top hat" basis functions for two levels of approximation, $N = 5$, $N = 10$.
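The error formula (7.215) can be checked against the coefficients (7.211). This is a sketch that assumes orthonormality of the top-hat set, so that the squared error of the partial sum is $\|t^2\|_2^2 - \sum_n \alpha_n^2$ with $\|t^2\|_2^2 = 1/5$; the function names are illustrative.

```python
import math

def alpha(n, N):
    """Coefficient of the n-th top-hat function, Eq. (7.211)."""
    return (3 * n * n - 3 * n + 1) / (3 * N ** 2.5)

def f_N(t, N):
    """Piecewise-constant partial sum; phi_n = sqrt(N) on ((n-1)/N, n/N]."""
    n = min(N, max(1, math.ceil(t * N)))
    return alpha(n, N) * math.sqrt(N)

def error_norm(N):
    """Delta_N computed from ||t^2||_2^2 - sum of alpha_n^2."""
    return math.sqrt(0.2 - sum(alpha(n, N) ** 2 for n in range(1, N + 1)))

def error_formula(N):
    """Delta_N from the closed form, Eq. (7.215)."""
    return math.sqrt(1 - 1 / (5 * N * N)) / (3 * N)

for N in (1, 5, 10):
    print(N, error_norm(N), error_formula(N))
```

On each subinterval the partial sum is simply the mean value of $t^2$ there; for example `f_N(0.5, 5)` returns the mean of $t^2$ over $(0.4, 0.6]$.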



Example 7.29

Demonstrate the Fourier sine series for $x(t) = 2t$ converges at a rate proportional to $1/\sqrt{N}$, where $N$ is the number of terms used to approximate $x(t)$, in $\mathbb{L}_2[0,1]$.

Consider the sequence of functions

$$\varphi_n(t) = \left\{ \sqrt{2} \sin(\pi t), \sqrt{2} \sin(2 \pi t), \ldots, \sqrt{2} \sin(n \pi t), \ldots \right\}. \tag{7.216}$$

It is easy to show linear independence for these functions. They are orthonormal in the Hilbert space $\mathbb{L}_2[0,1]$, e.g.

\begin{align}
\langle \varphi_2, \varphi_3 \rangle &= \int_0^1 \left( \sqrt{2} \sin(2 \pi t) \right) \left( \sqrt{2} \sin(3 \pi t) \right) dt = 0, \tag{7.217} \\
\langle \varphi_3, \varphi_3 \rangle &= \int_0^1 \left( \sqrt{2} \sin(3 \pi t) \right) \left( \sqrt{2} \sin(3 \pi t) \right) dt = 1. \tag{7.218}
\end{align}

Note that while the basis functions evaluate to 0 at both $t = 0$ and $t = 1$, the function itself only has value 0 at $t = 0$. We must tolerate a large error at $t = 1$, but hope that this error is confined to an ever collapsing neighborhood around $t = 1$ as more terms are included in the approximation.

The Fourier coefficients are

$$\alpha_n = \langle 2t, \varphi_n(t) \rangle = \int_0^1 (2t) \sqrt{2} \sin(n \pi t)\,dt = \frac{2 \sqrt{2} (-1)^{n+1}}{n \pi}. \tag{7.219}$$

The approximation then is

$$x_p(t) = \sum_{n=1}^{N} \frac{4 (-1)^{n+1}}{n \pi} \sin(n \pi t). \tag{7.220}$$

The norm of the error is then

$$\|x(t) - x_p(t)\|_2 = \sqrt{\int_0^1 \left( 2t - \left( \sum_{n=1}^{N} \frac{4 (-1)^{n+1}}{n \pi} \sin(n \pi t) \right) \right)^2 dt}. \tag{7.221}$$

This is difficult to evaluate analytically. It is straightforward to examine this with symbolic calculational software.

A plot of the norm of the error as a function of the number of terms in the approximation, $N$, is given in the log-log plot of Fig. 7.8. A weighted least squares curve fit, with a weighting factor proportional to $N^2$ so that priority is given to data as $N \to \infty$, shows that the function

$$\|x(t) - x_p(t)\|_2 \sim 0.841\,N^{-0.481}, \tag{7.222}$$

approximates the convergence performance well. In the log-log plot the exponent on $N$ is the slope. It appears from the graph that the slope may be approaching a limit, in which it is likely that

$$\lim_{N \to \infty} \|x(t) - x_p(t)\|_2 \sim \frac{1}{\sqrt{N}}. \tag{7.223}$$

Figure 7.8: Behavior of the error norm of the Fourier sine series approximation to $x(t) = 2t$ on $t \in [0,1]$ with the number $N$ of terms included in the series. Curve fit shown: $\|x(t) - x_p(t)\|_2 \sim 0.841\,N^{-0.481}$.

This indicates convergence of this series. Note that the series converges even though the norm of the $n$th basis function does not approach zero as $n \to \infty$:

$$\lim_{n \to \infty} \|\varphi_n\|_2 = 1, \tag{7.224}$$

since the basis functions are orthonormal. Also note that the behavior of the norm of the final term in the series,

$$\|\alpha_N \varphi_N(t)\|_2 = \sqrt{\int_0^1 \left( \frac{2 \sqrt{2} (-1)^{N+1}}{N \pi} \sqrt{2} \sin(N \pi t) \right)^2 dt} = \frac{2 \sqrt{2}}{N \pi}, \tag{7.225}$$

does not tell us how the series actually converges.
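The error norm of Example 7.29 can be evaluated without symbolic software. The sketch below assumes orthonormality of the sine basis, so that the squared error is $\|x\|_2^2 - \sum_{n=1}^{N} \alpha_n^2$ with $\|2t\|_2^2 = 4/3$ and $\alpha_n^2 = 8/(n\pi)^2$; the function names and the sample values of $N$ are illustrative.

```python
import math

def error_norm(N):
    """||x - x_p||_2 for x = 2t with N sine terms."""
    s = sum(8.0 / (n * math.pi) ** 2 for n in range(1, N + 1))
    return math.sqrt(4.0 / 3.0 - s)

def slope(N1, N2):
    """Slope of log(error) versus log(N) between two values of N."""
    return ((math.log(error_norm(N2)) - math.log(error_norm(N1)))
            / (math.log(N2) - math.log(N1)))

local_slope = slope(1000, 2000)
scaled = error_norm(2000) * math.sqrt(2000)   # tends to a constant if error ~ 1/sqrt(N)
print(local_slope, scaled)
```

For large $N$ the local slope approaches $-1/2$, consistent with Eq. (7.223), while the error of the last retained term alone decays like $1/N$.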
Example 7.30

Show the Fourier sine series for $x(t) = t - t^2$ converges at a rate proportional to $1/N^{5/2}$, where $N$ is the number of terms used to approximate $x(t)$, in $\mathbb{L}_2[0,1]$.

Again, consider the sequence of functions

$$\varphi_n(t) = \left\{ \sqrt{2} \sin(\pi t), \sqrt{2} \sin(2 \pi t), \ldots, \sqrt{2} \sin(n \pi t), \ldots \right\}, \tag{7.226}$$

which are, as before, linearly independent and moreover, orthonormal. Note that in this case, as opposed to the previous example, both the basis functions and the function to be approximated vanish identically at both $t = 0$ and $t = 1$. Consequently, there will be no error in the approximation at either end point. The Fourier coefficients are

$$\alpha_n = \frac{2 \sqrt{2} \left( 1 + (-1)^{n+1} \right)}{n^3 \pi^3}. \tag{7.227}$$

Note that $\alpha_n = 0$ for even values of $n$. Taking this into account and retaining only the necessary basis functions, we can write the Fourier sine series as

$$x(t) = t(1 - t) \simeq x_p(t) = \sum_{m=1}^{N} \frac{4 \sqrt{2}}{(2m-1)^3 \pi^3} \sqrt{2} \sin((2m-1) \pi t). \tag{7.228}$$

The norm of the error is then

$$\|x(t) - x_p(t)\|_2 = \sqrt{\int_0^1 \left( t(1-t) - \left( \sum_{m=1}^{N} \frac{4 \sqrt{2}}{(2m-1)^3 \pi^3} \sqrt{2} \sin((2m-1) \pi t) \right) \right)^2 dt}. \tag{7.229}$$

Again this is difficult to address analytically, but symbolic computation allows computation of the error norm as a function of $N$.

A plot of the norm of the error as a function of the number of terms in the approximation, $N$, is given in the log-log plot of Fig. 7.9. A weighted least squares curve fit, with a weighting factor proportional to $N^2$ so that priority is given to data as $N \to \infty$, shows that the function

$$\|x(t) - x_p(t)\|_2 \sim 0.00995\,N^{-2.492}, \tag{7.230}$$

approximates the convergence performance well. Thus, we might suspect that

$$\lim_{N \to \infty} \|x(t) - x_p(t)\|_2 \sim \frac{1}{N^{5/2}}. \tag{7.231}$$

Note that the convergence is much more rapid than in the previous example! This can be critically important in numerical calculations and demonstrates that a judicious selection of basis functions can have fruitful consequences.
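The rate in Example 7.30 can be estimated from the neglected coefficients alone. The sketch below assumes orthonormality, so that the squared error equals the sum of squares of the omitted coefficients $4\sqrt{2}/((2m-1)^3\pi^3)$, $m > N$; truncating that tail at a finite number of terms (`tail_terms`) is an illustrative choice.

```python
import math

def error_norm(N, tail_terms=20000):
    """||x - x_p||_2 for x = t(1 - t) with N retained (odd) sine terms."""
    s = sum(32.0 / ((2 * m - 1) ** 6 * math.pi ** 6)
            for m in range(N + 1, N + 1 + tail_terms))
    return math.sqrt(s)

# Slope of log(error) versus log(N) between N = 10 and N = 20
slope = (math.log(error_norm(20)) - math.log(error_norm(10))) / math.log(2.0)
prefactor = error_norm(10) * 10 ** 2.5
print(slope, prefactor)
```

The slope comes out close to $-5/2$ and the prefactor close to 0.01, in line with the curve fit of Eq. (7.230).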
Figure 7.9: Behavior of the error norm of the Fourier sine series approximation to $x(t) = t(1 - t)$ on $t \in [0,1]$ with the number $N$ of terms included in the series. Curve fit shown: $\|x(t) - x_p(t)\|_2 \sim 0.00994\,N^{-2.492}$.

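7.3.2.7 Parseval's equation, convergence, and completeness

We consider Parseval's¹⁹ equation and associated issues here. For a basis to be complete, we require that the norm of the difference of the series representation of all functions and the functions themselves converge to zero in $\mathbb{L}_2$ as the number of terms in the series approaches infinity. For an orthonormal basis $\varphi_n(t)$, this is

$$\lim_{N \to \infty} \left\| x(t) - \sum_{n=1}^{N} \alpha_n \varphi_n(t) \right\|_2 = 0. \tag{7.232}$$

Now for the orthonormal basis, we can show this reduces to a particularly simple form. Consider for instance the error for a one-term Fourier expansion

\begin{align}
\|x - \alpha \varphi\|_2^2 &= \langle x - \alpha \varphi, x - \alpha \varphi \rangle, \tag{7.233} \\
&= \langle x, x \rangle - \langle x, \alpha \varphi \rangle - \langle \alpha \varphi, x \rangle + \langle \alpha \varphi, \alpha \varphi \rangle, \tag{7.234} \\
&= \|x\|_2^2 - \alpha \langle x, \varphi \rangle - \overline{\alpha} \langle \varphi, x \rangle + \overline{\alpha} \alpha \langle \varphi, \varphi \rangle, \tag{7.235} \\
&= \|x\|_2^2 - \alpha \overline{\langle \varphi, x \rangle} - \overline{\alpha} \langle \varphi, x \rangle + \overline{\alpha} \alpha \langle \varphi, \varphi \rangle, \tag{7.236} \\
&= \|x\|_2^2 - \alpha \overline{\alpha} - \overline{\alpha} \alpha + \overline{\alpha} \alpha (1), \tag{7.237} \\
&= \|x\|_2^2 - \alpha \overline{\alpha}, \tag{7.238} \\
&= \|x\|_2^2 - |\alpha|^2. \tag{7.239}
\end{align}

Here we have used the definition of the Fourier coefficient $\langle \varphi, x \rangle = \alpha$, and orthonormality $\langle \varphi, \varphi \rangle = 1$. This is easily extended to multi-term expansions to give

$$\left\| x(t) - \sum_{n=1}^{N} \alpha_n \varphi_n(t) \right\|_2^2 = \|x(t)\|_2^2 - \sum_{n=1}^{N} |\alpha_n|^2. \tag{7.240}$$

So convergence, and thus completeness of the basis, is equivalent to requiring that

$$\|x(t)\|_2^2 = \lim_{N \to \infty} \sum_{n=1}^{N} |\alpha_n|^2, \tag{7.241}$$

for all functions $x(t)$. Note that this requirement is stronger than just requiring that the last Fourier coefficient vanish for large $N$; also note that it does not address the important question of the rate of convergence, which can be different for different functions $x(t)$, for the same basis.

¹⁹ Marc-Antoine Parseval des Chênes, 1755-1835, French mathematician.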
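The one-term identity (7.239) holds for complex vectors as well, provided the conjugate is placed consistently. The following is a minimal sketch in $\mathbb{C}^3$, assuming the inner product $\langle u, v \rangle = \sum_i \overline{u_i} v_i$; the particular vectors are arbitrary illustrative choices.

```python
import math

def inner(u, v):
    """<u, v> = sum of conj(u_i) * v_i (conjugate on the first argument)."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def norm_sq(u):
    """Squared 2-norm of a complex vector."""
    return inner(u, u).real

x = [1 + 2j, -3j, 2 + 0j]
raw = [1 + 1j, 1 + 0j, -1j]
scale = math.sqrt(norm_sq(raw))
phi = [r / scale for r in raw]          # normalized so that <phi, phi> = 1

alpha = inner(phi, x)                   # Fourier coefficient
residual = [xi - alpha * p for xi, p in zip(x, phi)]
lhs = norm_sq(residual)                 # ||x - alpha phi||^2
rhs = norm_sq(x) - abs(alpha) ** 2      # ||x||^2 - |alpha|^2
print(lhs, rhs)
```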

7.3.3 Reciprocal bases 

Let $\{u_1, \ldots, u_N\}$ be a basis of a finite-dimensional inner product space. Also let $\{u_1^R, \ldots, u_N^R\}$ be elements of the same space such that

$$\langle u_n, u_m^R \rangle = \delta_{nm}. \tag{7.242}$$

Then $\{u_1^R, \ldots, u_N^R\}$ is called the reciprocal (or dual) basis of $\{u_1, \ldots, u_N\}$. Of course an orthonormal basis is its own reciprocal. Since $\{u_1, \ldots, u_N\}$ is a basis, we can write any vector $x$ as

$$x = \sum_{m=1}^{N} \alpha_m u_m. \tag{7.243}$$

Taking the inner product of both sides with $u_n^R$, we get

\begin{align}
\langle u_n^R, x \rangle &= \left\langle u_n^R, \sum_{m=1}^{N} \alpha_m u_m \right\rangle, \tag{7.244} \\
&= \sum_{m=1}^{N} \langle u_n^R, \alpha_m u_m \rangle, \tag{7.245} \\
&= \sum_{m=1}^{N} \alpha_m \langle u_n^R, u_m \rangle, \tag{7.246} \\
&= \sum_{m=1}^{N} \alpha_m \delta_{nm}, \tag{7.247} \\
&= \alpha_n, \tag{7.248}
\end{align}

so that

$$x = \sum_{n=1}^{N} \langle u_n^R, x \rangle\, u_n. \tag{7.249}$$

The transformation of the representation of a vector $x$ from a basis to a dual basis is a type of alias transformation.



Example 7.31

A vector $v$ resides in $\mathbb{R}^2$. Its representation in Cartesian coordinates is $v = \xi = (3, 5)^T$. The vectors $u_1 = (2, 0)^T$ and $u_2 = (1, 3)^T$ span the space $\mathbb{R}^2$ and thus can be used as a basis on which to represent $v$. Find the reciprocal basis $u_1^R, u_2^R$, and use Eq. (7.249) to represent $v$ in terms of both the basis $u_1, u_2$ and then the reciprocal basis $u_1^R, u_2^R$.

We adopt the dot product as our inner product. Let's get $\alpha_1, \alpha_2$. To do this we first need the reciprocal basis vectors which are defined by the inner product:

$$\langle u_n, u_m^R \rangle = \delta_{nm}. \tag{7.250}$$

We take

$$u_1^R = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}, \qquad u_2^R = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}. \tag{7.251}$$

Expanding Eq. (7.250), we get,

\begin{align}
\langle u_1, u_1^R \rangle &= u_1^T u_1^R = (2, 0) \cdot \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} = (2) a_{11} + (0) a_{21} = 1, \tag{7.252} \\
\langle u_1, u_2^R \rangle &= u_1^T u_2^R = (2, 0) \cdot \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} = (2) a_{12} + (0) a_{22} = 0, \tag{7.253} \\
\langle u_2, u_1^R \rangle &= u_2^T u_1^R = (1, 3) \cdot \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} = (1) a_{11} + (3) a_{21} = 0, \tag{7.254} \\
\langle u_2, u_2^R \rangle &= u_2^T u_2^R = (1, 3) \cdot \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} = (1) a_{12} + (3) a_{22} = 1. \tag{7.255}
\end{align}

Solving, we get

$$a_{11} = \frac{1}{2}, \qquad a_{21} = -\frac{1}{6}, \qquad a_{12} = 0, \qquad a_{22} = \frac{1}{3}, \tag{7.256}$$

so substituting into Eq. (7.251), we get expressions for the reciprocal base vectors:

$$u_1^R = \begin{pmatrix} 1/2 \\ -1/6 \end{pmatrix}, \qquad u_2^R = \begin{pmatrix} 0 \\ 1/3 \end{pmatrix}. \tag{7.257}$$

We can now get the coefficients $\alpha_i$:

\begin{align}
\alpha_1 &= \langle u_1^R, \xi \rangle = \left( \frac{1}{2}, -\frac{1}{6} \right) \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \frac{3}{2} - \frac{5}{6} = \frac{2}{3}, \tag{7.258} \\
\alpha_2 &= \langle u_2^R, \xi \rangle = \left( 0, \frac{1}{3} \right) \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = 0 + \frac{5}{3} = \frac{5}{3}. \tag{7.259}
\end{align}

So on the new basis, $v$ can be represented as

$$v = \frac{2}{3} u_1 + \frac{5}{3} u_2. \tag{7.260}$$

The representation is shown geometrically in Fig. 7.10. Note that $u_1^R$ is orthogonal to $u_2$ and that $u_2^R$ is orthogonal to $u_1$. Further since $\|u_1\|_2 > 1$, $\|u_2\|_2 > 1$, we get $\|u_1^R\|_2 < 1$ and $\|u_2^R\|_2 < 1$ in order to have $\langle u_i, u_j^R \rangle = \delta_{ij}$.

In a similar manner it is easily shown that $v$ can be represented in terms of the reciprocal basis as

$$v = \sum_{n=1}^{N} \beta_n u_n^R = \beta_1 u_1^R + \beta_2 u_2^R, \tag{7.261}$$

where

$$\beta_n = \langle u_n, \xi \rangle. \tag{7.262}$$

For this problem, this yields

$$v = 6 u_1^R + 18 u_2^R. \tag{7.263}$$

Thus, we see for the non-orthogonal basis that two natural representations of the same vector exist. One of these is actually a covariant representation; the other is contravariant.

Figure 7.10: Representation of a vector $x$ on a non-orthogonal contravariant basis $u_1$, $u_2$ and its reciprocal covariant basis $u_1^R$, $u_2^R$.

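Let us show this is consistent with the earlier described notions using "upstairs-downstairs" notation of Sec. 1.3. Note that our non-orthogonal coordinate system is a transformation of the form

$$\xi^i = \frac{\partial \xi^i}{\partial x^j} x^j, \tag{7.264}$$

where $\xi^i$ is the Cartesian representation, and $x^j$ is the contravariant representation in the transformed system. In Gibbs form, this is

$$\xi = \mathbf{J} \cdot x. \tag{7.265}$$

Inverting, we also have

$$x = \mathbf{J}^{-1} \cdot \xi. \tag{7.266}$$

For this problem, we have

$$\frac{\partial \xi^i}{\partial x^j} = \mathbf{J} = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} u_1 & u_2 \end{pmatrix}, \tag{7.267}$$

where $u_1$ and $u_2$ are the columns, so that

$$\begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \cdot \begin{pmatrix} x^1 \\ x^2 \end{pmatrix}. \tag{7.268}$$

Note that the unit vector in the transformed space

$$\begin{pmatrix} x^1 \\ x^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{7.269}$$

has representation in Cartesian space of $(2, 0)^T$, and the other unit vector in the transformed space

$$\begin{pmatrix} x^1 \\ x^2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{7.270}$$

has representation in Cartesian space of $(1, 3)^T$.
Now the metric tensor is

$$g_{ij} = \mathbf{G} = \mathbf{J}^T \cdot \mathbf{J} = \begin{pmatrix} 2 & 0 \\ 1 & 3 \end{pmatrix} \cdot \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ 2 & 10 \end{pmatrix}. \tag{7.271}$$

The Cartesian vector $\xi = (3, 5)^T$ has a contravariant representation in the transformed space of

$$x = \mathbf{J}^{-1} \cdot \xi = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}^{-1} \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} 1/2 & -1/6 \\ 0 & 1/3 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 5/3 \end{pmatrix} = x^j. \tag{7.272}$$

This is consistent with our earlier finding.

This vector has a covariant representation as well by the formula

$$x_i = g_{ij} x^j = \begin{pmatrix} 4 & 2 \\ 2 & 10 \end{pmatrix} \cdot \begin{pmatrix} 2/3 \\ 5/3 \end{pmatrix} = \begin{pmatrix} 6 \\ 18 \end{pmatrix}. \tag{7.273}$$

Once again, this is consistent with our earlier finding.

Note further that

$$\mathbf{J}^{-1} = \begin{pmatrix} 1/2 & -1/6 \\ 0 & 1/3 \end{pmatrix}. \tag{7.274}$$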



The rows of this matrix describe the reciprocal basis vectors, which is also consistent with our earlier finding. So if we think of the columns of any matrix as forming a basis, the rows of the inverse of that matrix form the reciprocal basis:

$$\begin{pmatrix} u_1 & \cdots & u_N \end{pmatrix}^{-1} = \begin{pmatrix} (u_1^R)^T \\ \vdots \\ (u_N^R)^T \end{pmatrix}. \tag{7.275}$$

Lastly note that $\det \mathbf{J} = 6$, so the transformation is orientation-preserving, but not volume-preserving. A unit volume element in $\xi$-space is larger than one in $x$-space. Moreover the mapping $\xi = \mathbf{J} \cdot x$ can be shown to involve both stretching and rotation.
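The numbers of Example 7.31 follow from a single 2 x 2 inversion. The sketch below uses exact rational arithmetic from the Python standard library; the helper names `inverse2` and `dot` are illustrative.

```python
from fractions import Fraction as F

def inverse2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

u1, u2 = [F(2), F(0)], [F(1), F(3)]
xi = [F(3), F(5)]

J = [[u1[0], u2[0]], [u1[1], u2[1]]]     # basis vectors as columns
Jinv = inverse2(J)
u1R, u2R = Jinv[0], Jinv[1]              # rows of the inverse: reciprocal basis

alpha = [dot(u1R, xi), dot(u2R, xi)]     # contravariant components
beta = [dot(u1, xi), dot(u2, xi)]        # covariant components
G = [[dot(u1, u1), dot(u1, u2)],
     [dot(u2, u1), dot(u2, u2)]]         # metric tensor J^T J
print(u1R, u2R, alpha, beta, G)
```

Both representations rebuild the same Cartesian vector: `alpha[0]*u1 + alpha[1]*u2` and `beta[0]*u1R + beta[1]*u2R` each give $(3, 5)^T$.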

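Example 7.32

For the previous example problem, consider the tensor $\mathbf{A}$, whose representation in the Cartesian space is

$$\mathbf{A} = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}. \tag{7.276}$$

Demonstrate the invariance of the scalar $\xi^T \cdot \mathbf{A} \cdot \xi$ in the non-Cartesian space.

First, in the Cartesian space we have

$$\xi^T \cdot \mathbf{A} \cdot \xi = \begin{pmatrix} 3 & 5 \end{pmatrix} \cdot \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = 152. \tag{7.277}$$

Now $\mathbf{A}$ has a different representation, $\mathbf{A}'$, in the transformed coordinate system via the definition of a tensor, Eq. (1.181), which for this linear alias transformation, reduces to²⁰

$$\mathbf{A}' = \mathbf{J}^{-1} \cdot \mathbf{A} \cdot \mathbf{J}. \tag{7.278}$$

So

\begin{align}
\mathbf{A}' &= \underbrace{\begin{pmatrix} 1/2 & -1/6 \\ 0 & 1/3 \end{pmatrix}}_{\mathbf{J}^{-1}} \cdot \underbrace{\begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}}_{\mathbf{A}} \cdot \underbrace{\begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}}_{\mathbf{J}}, \tag{7.279} \\
&= \begin{pmatrix} 8/3 & 19/3 \\ 2/3 & 7/3 \end{pmatrix}. \tag{7.280}
\end{align}

We also see by inversion that

$$\mathbf{A} = \mathbf{J} \cdot \mathbf{A}' \cdot \mathbf{J}^{-1}. \tag{7.281}$$

Since $\xi = \mathbf{J} \cdot x$, our tensor invariant becomes in the transformed space

\begin{align}
\xi^T \cdot \mathbf{A} \cdot \xi &= (\mathbf{J} \cdot x)^T \cdot (\mathbf{J} \cdot \mathbf{A}' \cdot \mathbf{J}^{-1}) \cdot (\mathbf{J} \cdot x), \tag{7.282} \\
&= x^T \cdot \underbrace{\mathbf{J}^T \cdot \mathbf{J}}_{\mathbf{G}} \cdot \mathbf{A}' \cdot x, \tag{7.283} \\
&= \underbrace{x^T \cdot \mathbf{G}}_{\text{covariant } x} \cdot \mathbf{A}' \cdot x, \tag{7.284} \\
&= \begin{pmatrix} 2/3 & 5/3 \end{pmatrix} \cdot \begin{pmatrix} 4 & 2 \\ 2 & 10 \end{pmatrix} \cdot \begin{pmatrix} 8/3 & 19/3 \\ 2/3 & 7/3 \end{pmatrix} \cdot \begin{pmatrix} 2/3 \\ 5/3 \end{pmatrix}, \tag{7.285} \\
&= \underbrace{\begin{pmatrix} 6 & 18 \end{pmatrix}}_{\text{covariant } x} \cdot \underbrace{\begin{pmatrix} 8/3 & 19/3 \\ 2/3 & 7/3 \end{pmatrix}}_{\mathbf{A}'} \cdot \underbrace{\begin{pmatrix} 2/3 \\ 5/3 \end{pmatrix}}_{\text{contravariant } x}, \tag{7.286} \\
&= 152. \tag{7.287}
\end{align}

Note that $x^T \cdot \mathbf{G}$ gives the covariant representation of $x$.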
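The invariance shown in Example 7.32 can be confirmed with exact arithmetic. A minimal sketch follows, with illustrative helper names `matmul` and `transpose`, and the matrices entered by hand from the example.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

J = [[F(2), F(1)], [F(0), F(3)]]
Jinv = [[F(1, 2), F(-1, 6)], [F(0), F(1, 3)]]
A = [[F(3), F(4)], [F(1), F(2)]]
xi = [[F(3)], [F(5)]]                      # Cartesian column vector

Ap = matmul(matmul(Jinv, A), J)            # A' = J^{-1} A J
x = matmul(Jinv, xi)                       # contravariant components
G = matmul(transpose(J), J)                # metric tensor

cartesian = matmul(matmul(transpose(xi), A), xi)[0][0]
transformed = matmul(matmul(matmul(transpose(x), G), Ap), x)[0][0]
print(Ap, cartesian, transformed)
```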



²⁰ If $\mathbf{J}$ had been a rotation matrix $\mathbf{Q}$, for which $\mathbf{Q}^T = \mathbf{Q}^{-1}$ and $\det \mathbf{Q} = 1$, then $\mathbf{A}' = \mathbf{Q}^T \cdot \mathbf{A} \cdot \mathbf{Q}$ from Eq. (6.80). Here our linear transformation has both stretching and rotation associated with it.

Example 7.33

Given a space spanned by the functions $u_1 = 1$, $u_2 = t$, $u_3 = t^2$, for $t \in [0,1]$, find a reciprocal basis $u_1^R$, $u_2^R$, $u_3^R$ within this space.

We insist that

$$\langle u_n, u_m^R \rangle = \int_0^1 u_n(t)\, u_m^R(t)\,dt = \delta_{nm}. \tag{7.289}$$

If we assume that

\begin{align}
u_1^R &= a_1 + a_2 t + a_3 t^2, \tag{7.290} \\
u_2^R &= b_1 + b_2 t + b_3 t^2, \tag{7.291} \\
u_3^R &= c_1 + c_2 t + c_3 t^2, \tag{7.292}
\end{align}

and substitute directly into Eq. (7.289), it is easy to find that

\begin{align}
u_1^R &= 9 - 36 t + 30 t^2, \tag{7.293} \\
u_2^R &= -36 + 192 t - 180 t^2, \tag{7.294} \\
u_3^R &= 30 - 180 t + 180 t^2. \tag{7.295}
\end{align}
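Substituting Eqs. (7.290-7.292) into Eq. (7.289) gives a linear system whose matrix is the Gram matrix $\langle t^{i}, t^{j} \rangle = 1/(i+j+1)$, so the coefficient rows are the rows of its inverse. The sketch below assumes this reformulation and uses exact rational Gauss-Jordan elimination; the function name `invert` is illustrative.

```python
from fractions import Fraction as F

def invert(A):
    """Gauss-Jordan inversion with exact rational arithmetic."""
    n = len(A)
    M = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

# Gram matrix of u = 1, t, t^2 on [0, 1]
G = [[F(1, i + j + 1) for j in range(3)] for i in range(3)]
R = invert(G)    # row m holds the coefficients of u_m^R in powers of t
for row in R:
    print([int(v) for v in row])
```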



7.4 Operators 

• For two sets $X$ and $Y$, an operator (or mapping, or transformation) $f$ is a rule that associates every $x \in X$ with an image $y \in Y$. We can write $f : X \to Y$, $X \xrightarrow{f} Y$ or $x \mapsto y$. $X$ is the domain of the operator, and $Y$ is the range.

• If every element of $Y$ is not necessarily an image, then $X$ is mapped into $Y$; this map is called an injection.

• If, on the other hand, every element of $Y$ is an image of some element of $X$, then $X$ is mapped onto $Y$ and the map is a surjection.

• If, $\forall x \in X$ there is a unique $y \in Y$, and for every $y \in Y$ there is a unique $x \in X$, the operator is one-to-one or invertible; it is a bijection.

• $f$ and $g$ are inverses of each other, when $X \xrightarrow{f} Y$ and $Y \xrightarrow{g} X$.

• $f : X \to Y$ is continuous at $x_0 \in X$ if, for every $\epsilon > 0$, there is a $\delta > 0$, such that $\|f(x) - f(x_0)\| < \epsilon$ $\forall\, x$ satisfying $\|x - x_0\| < \delta$.

• If for every bounded sequence $x_n$ in a Hilbert space the sequence $f(x_n)$ contains a convergent subsequence, then $f$ is said to be compact.

A Venn diagram showing various classes of operators is given in Fig. 7.11. Examples of continuous operators are:

1. $(x_1, x_2, \ldots, x_N) \mapsto y$, where $y = f(x_1, x_2, \ldots, x_N)$.

2. $f \mapsto g$, where $g = df/dt$.

3. $f \mapsto g$, where $g(t) = \int K(s,t) f(s)\,ds$. $K(s,t)$ is called the kernel of the integral transformation. If $\int\!\!\int |K(s,t)|^2\,ds\,dt$ is finite, then $f$ belongs to $\mathbb{L}_2$ if $g$ does.

4. $(x_1, x_2, \ldots, x_M)^T \mapsto (y_1, y_2, \ldots, y_N)^T$, where $y = \mathbf{A} x$ with $y$, $\mathbf{A}$, and $x$ being $N \times 1$, $N \times M$, and $M \times 1$ matrices, respectively ($y_{N \times 1} = \mathbf{A}_{N \times M} x_{M \times 1}$), and the usual matrix multiplication is assumed. Here $\mathbf{A}$ is a left operator, and is the most common type of matrix operator.

5. $(x_1, x_2, \ldots, x_N) \mapsto (y_1, y_2, \ldots, y_M)$, where $y = x \mathbf{A}$ with $y$, $x$, and $\mathbf{A}$ being $1 \times M$, $1 \times N$, and $N \times M$ matrices, respectively ($y_{1 \times M} = x_{1 \times N} \mathbf{A}_{N \times M}$), and the usual matrix multiplication is assumed. Here $\mathbf{A}$ is a right operator.

Figure 7.11: Venn diagram showing classes of operators. Panels: Injection: Inverse may not exist; Surjection: Inverse not always unique; Bijection (one-to-one): Inverse always exists. Each panel shows a domain and a range.
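The difference between the left and right matrix operators of items 4 and 5 is only in which index of the matrix is summed. A small sketch, with an arbitrary illustrative 2 x 3 matrix and hypothetical function names `left_apply` and `right_apply`:

```python
def left_apply(A, x):
    """y = A x: A is N x M, x has length M, y has length N."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def right_apply(x, A):
    """y = x A: x has length N, A is N x M, y has length M."""
    return [sum(x[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

A = [[1, 2, 3],
     [4, 5, 6]]                          # N = 2 rows, M = 3 columns

y_left = left_apply(A, [1, 0, -1])       # column vector of length 3 -> length 2
y_right = right_apply([1, -1], A)        # row vector of length 2 -> length 3
print(y_left, y_right)
```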



7.4.1 Linear operators 

• A linear operator $\mathbf{L}$ is one that satisfies

\begin{align}
\mathbf{L}(x + y) &= \mathbf{L}x + \mathbf{L}y, \tag{7.296} \\
\mathbf{L}(\alpha x) &= \alpha \mathbf{L}x. \tag{7.297}
\end{align}

• An operator $\mathbf{L}$ is bounded if $\forall x \in X$ $\exists$ a constant $c$ such that

$$\|\mathbf{L}x\| \le c \|x\|. \tag{7.298}$$

A derivative is an example of an unbounded linear operator.

• A special operator is the identity $\mathbf{I}$, which is defined by $\mathbf{I}x = x$.

• The null space or kernel of an operator $\mathbf{L}$ is the set of all $x$ such that $\mathbf{L}x = 0$. The null space is a vector space.

• The norm of an operator $\mathbf{L}$ can be defined as

$$\|\mathbf{L}\| = \sup_{x \ne 0} \frac{\|\mathbf{L}x\|}{\|x\|}. \tag{7.299}$$

• An operator $\mathbf{L}$ is

positive definite if $\langle \mathbf{L}x, x \rangle > 0$,
positive semi-definite if $\langle \mathbf{L}x, x \rangle \ge 0$,
negative definite if $\langle \mathbf{L}x, x \rangle < 0$,
negative semi-definite if $\langle \mathbf{L}x, x \rangle \le 0$,

$\forall\, x \ne 0$.

• For a matrix $\mathbf{A}$, $\mathbb{C}^M \to \mathbb{C}^N$, the spectral norm $\|\mathbf{A}\|_2$ is defined as

$$\|\mathbf{A}\|_2 = \sup_{x \ne 0} \frac{\|\mathbf{A}x\|_2}{\|x\|_2}. \tag{7.300}$$

This can be shown to reduce to

$$\|\mathbf{A}\|_2 = \sqrt{\kappa_{max}}, \tag{7.301}$$

where $\kappa_{max}$ is the largest eigenvalue of the matrix $\overline{\mathbf{A}}^T \cdot \mathbf{A}$. It will soon be shown in Sec. 7.4.4 that because $\overline{\mathbf{A}}^T \cdot \mathbf{A}$ is symmetric, all of its eigenvalues are guaranteed real. Moreover, it can be shown that they are also all greater than or equal to zero. Hence, the definition will satisfy all properties of the norm. This holds only for Hilbert spaces and not for arbitrary Banach spaces. There are also other valid definitions of norms for matrix operators. For example, the $p$-norm of a matrix $\mathbf{A}$ is

$$\|\mathbf{A}\|_p = \sup_{x \ne 0} \frac{\|\mathbf{A}x\|_p}{\|x\|_p}. \tag{7.302}$$
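The equivalence of the definition (7.300) and the eigenvalue formula (7.301) can be illustrated for a real 2 x 2 matrix. The sketch below assumes a real matrix (so the conjugate is immaterial), uses the closed-form largest eigenvalue of the symmetric matrix $\mathbf{A}^T \cdot \mathbf{A}$, and compares with a brute-force search over unit vectors; the matrix and function names are illustrative.

```python
import math

def spectral_norm_2x2(A):
    """sqrt of the largest eigenvalue of A^T A for a real 2 x 2 matrix."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d   # A^T A = [[p, q], [q, r]]
    kappa_max = 0.5 * (p + r) + math.sqrt(0.25 * (p - r) ** 2 + q * q)
    return math.sqrt(kappa_max)

def sampled_sup(A, samples=100000):
    """Approximate sup of ||A x||_2 over unit vectors x = (cos th, sin th)."""
    (a, b), (c, d) = A
    best = 0.0
    for k in range(samples):
        th = math.pi * k / samples
        x1, x2 = math.cos(th), math.sin(th)
        best = max(best, math.hypot(a * x1 + b * x2, c * x1 + d * x2))
    return best

A = [[2.0, 1.0], [0.0, 3.0]]
exact = spectral_norm_2x2(A)
approx = sampled_sup(A)
print(exact, approx)
```

The sampled value approaches the eigenvalue-based value from below, as expected for a supremum estimated on a finite set of directions.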

7.4.2 Adjoint operators 

The operator L* is the adjoint of the operator L, if 

<Lx, y> = <x, L*y>. (7.303) 

If L* = L, the operator is self-adjoint. 
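Example 7.34

Find the adjoint of the real matrix $\mathbf{A} : \mathbb{R}^2 \to \mathbb{R}^2$, where

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \tag{7.304}$$

We assume $a_{11}, a_{12}, a_{21}, a_{22}$ are known constants.
Let the adjoint of $\mathbf{A}$ be

$$\mathbf{A}^* = \begin{pmatrix} a_{11}^* & a_{12}^* \\ a_{21}^* & a_{22}^* \end{pmatrix}. \tag{7.305}$$

Here the starred quantities are to be determined. We also have for $x$ and $y$:

\begin{align}
x &= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \tag{7.306} \\
y &= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \tag{7.307}
\end{align}

We take Eq. (7.303) and expand:

\begin{align}
\langle \mathbf{A}x, y \rangle &= \langle x, \mathbf{A}^* y \rangle, \tag{7.308} \\
\left( \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right)^T \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} &= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^T \left( \begin{pmatrix} a_{11}^* & a_{12}^* \\ a_{21}^* & a_{22}^* \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right), \tag{7.309} \\
\begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}^T \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} &= \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^T \begin{pmatrix} a_{11}^* y_1 + a_{12}^* y_2 \\ a_{21}^* y_1 + a_{22}^* y_2 \end{pmatrix}, \tag{7.310} \\
\begin{pmatrix} a_{11} x_1 + a_{12} x_2 & a_{21} x_1 + a_{22} x_2 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} &= \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} a_{11}^* y_1 + a_{12}^* y_2 \\ a_{21}^* y_1 + a_{22}^* y_2 \end{pmatrix}, \tag{7.311} \\
(a_{11} x_1 + a_{12} x_2) y_1 + (a_{21} x_1 + a_{22} x_2) y_2 &= x_1 (a_{11}^* y_1 + a_{12}^* y_2) + x_2 (a_{21}^* y_1 + a_{22}^* y_2). \tag{7.312}
\end{align}

Rearrange and get

$$(a_{11} - a_{11}^*) x_1 y_1 + (a_{21} - a_{12}^*) x_1 y_2 + (a_{12} - a_{21}^*) x_2 y_1 + (a_{22} - a_{22}^*) x_2 y_2 = 0. \tag{7.313}$$

Since this must hold for any $x_1, x_2, y_1, y_2$, we have

\begin{align}
a_{11}^* &= a_{11}, \tag{7.314} \\
a_{12}^* &= a_{21}, \tag{7.315} \\
a_{21}^* &= a_{12}, \tag{7.316} \\
a_{22}^* &= a_{22}. \tag{7.317}
\end{align}

Thus,

\begin{align}
\mathbf{A}^* &= \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}, \tag{7.318} \\
&= \mathbf{A}^T. \tag{7.319}
\end{align}

Thus, a symmetric matrix is self-adjoint. This result is easily extended to complex matrices $\mathbf{A} : \mathbb{C}^N \to \mathbb{C}^M$:

$$\mathbf{A}^* = \overline{\mathbf{A}}^T. \tag{7.320}$$
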
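The defining relation $\langle \mathbf{A}x, y \rangle = \langle x, \mathbf{A}^* y \rangle$ with $\mathbf{A}^* = \overline{\mathbf{A}}^T$ can be spot-checked for a complex matrix. This sketch assumes the inner product $\langle u, v \rangle = \sum_i \overline{u_i} v_i$; the matrix and vectors are arbitrary illustrative values.

```python
def inner(u, v):
    """<u, v> = sum of conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

A = [[1 + 2j, 3 - 1j], [0 + 4j, 2 + 0j]]
x = [1 - 1j, 2 + 3j]
y = [-2 + 1j, 0 + 5j]

lhs = inner(apply(A, x), y)              # <A x, y>
rhs = inner(x, apply(adjoint(A), y))     # <x, A* y>
print(lhs, rhs)
```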

Example 7.35

Find the adjoint of the differential operator $\mathbf{L} : X \to X$, where

$$\mathbf{L} = \frac{d^2}{ds^2} + \frac{d}{ds}, \tag{7.321}$$

and $X$ is the subspace of $\mathbb{L}_2[0,1]$ with $x(0) = x(1) = 0$ if $x \in X$.

Using integration by parts on the inner product

\begin{align}
\langle \mathbf{L}x, y \rangle &= \int_0^1 \left( x''(s) + x'(s) \right) y(s)\,ds, \tag{7.322} \\
&= \int_0^1 x''(s) y(s)\,ds + \int_0^1 x'(s) y(s)\,ds, \tag{7.323} \\
&= \left( x'(1) y(1) - x'(0) y(0) - \int_0^1 x'(s) y'(s)\,ds \right) + \Big( \underbrace{x(1)}_{=0} y(1) - \underbrace{x(0)}_{=0} y(0) - \int_0^1 x(s) y'(s)\,ds \Big), \tag{7.324} \\
&= x'(1) y(1) - x'(0) y(0) - \int_0^1 x'(s) y'(s)\,ds - \int_0^1 x(s) y'(s)\,ds, \tag{7.325} \\
&= x'(1) y(1) - x'(0) y(0) - \Big( \underbrace{x(1)}_{=0} y'(1) - \underbrace{x(0)}_{=0} y'(0) - \int_0^1 x(s) y''(s)\,ds \Big) - \int_0^1 x(s) y'(s)\,ds, \tag{7.326} \\
&= x'(1) y(1) - x'(0) y(0) + \int_0^1 x(s) y''(s)\,ds - \int_0^1 x(s) y'(s)\,ds, \tag{7.327} \\
&= x'(1) y(1) - x'(0) y(0) + \int_0^1 x(s) \left( y''(s) - y'(s) \right) ds. \tag{7.328}
\end{align}

This maintains the form of an inner product in $\mathbb{L}_2[0,1]$ if we require $y(0) = y(1) = 0$; doing this, we get

$$\langle \mathbf{L}x, y \rangle = \int_0^1 x(s) \left( y''(s) - y'(s) \right) ds = \langle x, \mathbf{L}^* y \rangle. \tag{7.329}$$

We see by inspection that the adjoint operator is

$$\mathbf{L}^* = \frac{d^2}{ds^2} - \frac{d}{ds}. \tag{7.330}$$

Because the adjoint operator is not equal to the operator itself, the operator is not self-adjoint.

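The result of Example 7.35 can be checked numerically for particular functions. The sketch below assumes the test functions $x(s) = \sin(\pi s)$ and $y(s) = s(1-s)$, both chosen here only because they vanish at $s = 0$ and $s = 1$, with their derivatives entered by hand and Simpson quadrature for the integrals.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

x = lambda s: math.sin(math.pi * s)
Lx = lambda s: (-math.pi ** 2 * math.sin(math.pi * s)
                + math.pi * math.cos(math.pi * s))         # x'' + x'
y = lambda s: s * (1.0 - s)
Lstar_y = lambda s: -2.0 - (1.0 - 2.0 * s)                   # y'' - y'

lhs = simpson(lambda s: Lx(s) * y(s), 0.0, 1.0)             # <L x, y>
rhs = simpson(lambda s: x(s) * Lstar_y(s), 0.0, 1.0)        # <x, L* y>
print(lhs, rhs)
```

For these functions both inner products evaluate to $-4/\pi$; using $y'' + y'$ in place of $y'' - y'$ would not reproduce the left side.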

Example 7.36

Find the adjoint of the differential operator $\mathbf{L} : X \to X$, where $\mathbf{L} = d^2/ds^2$, and $X$ is the subspace of $\mathbb{L}_2[0,1]$ with $x(0) = x(1) = 0$ if $x \in X$.

Using integration by parts on the inner product

\begin{align}
\langle \mathbf{L}x, y \rangle &= \int_0^1 x''(s) y(s)\,ds, \tag{7.331} \\
&= x'(1) y(1) - x'(0) y(0) - \int_0^1 x'(s) y'(s)\,ds, \tag{7.332} \\
&= x'(1) y(1) - x'(0) y(0) - \Big( \underbrace{x(1)}_{=0} y'(1) - \underbrace{x(0)}_{=0} y'(0) - \int_0^1 x(s) y''(s)\,ds \Big), \tag{7.333} \\
&= x'(1) y(1) - x'(0) y(0) + \int_0^1 x(s) y''(s)\,ds. \tag{7.334}
\end{align}

If we require $y(0) = y(1) = 0$, then

$$\langle \mathbf{L}x, y \rangle = \int_0^1 x(s) y''(s)\,ds = \langle x, \mathbf{L}^* y \rangle. \tag{7.335}$$

In this case, we see that $\mathbf{L} = \mathbf{L}^*$, so the operator is self-adjoint.


Example 7.37

Find the adjoint of the integral operator $\mathbf{L} : \mathbb{L}_2[a,b] \to \mathbb{L}_2[a,b]$, where

$$\mathbf{L}x = \int_a^b K(s,t)\, x(s)\,ds. \tag{7.336}$$

The inner product

\begin{align}
\langle \mathbf{L}x, y \rangle &= \int_a^b \left( \int_a^b K(s,t)\, x(s)\,ds \right) y(t)\,dt, \tag{7.337} \\
&= \int_a^b \int_a^b K(s,t)\, x(s)\, y(t)\,ds\,dt, \tag{7.338} \\
&= \int_a^b \int_a^b x(s)\, K(s,t)\, y(t)\,dt\,ds, \tag{7.339} \\
&= \int_a^b x(s) \left( \int_a^b K(s,t)\, y(t)\,dt \right) ds, \tag{7.340} \\
&= \langle x, \mathbf{L}^* y \rangle, \tag{7.341}
\end{align}

where

$$\mathbf{L}^* y = \int_a^b K(s,t)\, y(t)\,dt, \tag{7.342}$$

or equivalently

$$\mathbf{L}^* y = \int_a^b K(t,s)\, y(s)\,ds. \tag{7.343}$$

Note in the definition of $\mathbf{L}x$, the second argument of $K$ is a free variable, while in the consequent definition of $\mathbf{L}^* y$, the first argument of $K$ is a free argument. So in general, the operator and its adjoint are different. Note however, that

$$\text{if } K(s,t) = K(t,s), \text{ then the operator is self-adjoint.} \tag{7.344}$$

That is, a symmetric kernel yields a self-adjoint operator.
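A discretized version of Example 7.37 makes the role of the kernel arguments concrete. The sketch below assumes midpoint-rule quadrature on $[0, 1]$ and an arbitrary non-symmetric kernel $K(s,t) = e^{s} t$ chosen only for illustration; on the grid, the adjoint amounts to summing over the other index of the kernel matrix.

```python
import math

def make_grid(a, b, n):
    """Midpoints of n equal subintervals of [a, b] and the spacing h."""
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h

def apply_L(K, x, pts, h):
    """(L x)(t_j) = sum over i of K(s_i, t_j) x(s_i) h."""
    return [sum(K(s, t) * x(s) for s in pts) * h for t in pts]

def apply_Lstar(K, y, pts, h):
    """(L* y)(s_i) = sum over j of K(s_i, t_j) y(t_j) h."""
    return [sum(K(s, t) * y(t) for t in pts) * h for s in pts]

K = lambda s, t: math.exp(s) * t         # non-symmetric kernel, for illustration
x = lambda s: math.sin(s)
y = lambda t: 1.0 + t * t

pts, h = make_grid(0.0, 1.0, 200)
lhs = sum(Lx * y(t) for Lx, t in zip(apply_L(K, x, pts, h), pts)) * h       # <L x, y>
rhs = sum(x(s) * Ly for s, Ly in zip(pts, apply_Lstar(K, y, pts, h))) * h   # <x, L* y>
print(lhs, rhs)
```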



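Properties:

\begin{align}
\|\mathbf{L}^*\| &= \|\mathbf{L}\|, \tag{7.345} \\
(\mathbf{L}_1 + \mathbf{L}_2)^* &= \mathbf{L}_1^* + \mathbf{L}_2^*, \tag{7.346} \\
(\alpha \mathbf{L})^* &= \overline{\alpha}\, \mathbf{L}^*, \tag{7.347} \\
(\mathbf{L}_1 \mathbf{L}_2)^* &= \mathbf{L}_2^* \mathbf{L}_1^*, \tag{7.348} \\
(\mathbf{L}^*)^* &= \mathbf{L}, \tag{7.349} \\
(\mathbf{L}^{-1})^* &= (\mathbf{L}^*)^{-1}, \quad \text{if } \mathbf{L}^{-1} \text{ exists.} \tag{7.350}
\end{align}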

7.4.3 Inverse operators 

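Let

$$\mathbf{L}x = y. \tag{7.351}$$

If an inverse of $\mathbf{L}$ exists, which we will call $\mathbf{L}^{-1}$, then

$$x = \mathbf{L}^{-1} y. \tag{7.352}$$

Using Eq. (7.352) to eliminate $x$ in favor of $y$ in Eq. (7.351), we get

$$\mathbf{L} \underbrace{\mathbf{L}^{-1} y}_{=x} = y, \tag{7.353}$$

so that

$$\mathbf{L} \mathbf{L}^{-1} = \mathbf{I}. \tag{7.354}$$

A property of the inverse operator is

$$(\mathbf{L}_a \mathbf{L}_b)^{-1} = \mathbf{L}_b^{-1} \mathbf{L}_a^{-1}. \tag{7.355}$$

Let's show this. Say

$$y = \mathbf{L}_a \mathbf{L}_b x. \tag{7.356}$$

Then

\begin{align}
\mathbf{L}_a^{-1} y &= \mathbf{L}_b x, \tag{7.357} \\
\mathbf{L}_b^{-1} \mathbf{L}_a^{-1} y &= x. \tag{7.358}
\end{align}

Consequently, we see that

$$(\mathbf{L}_a \mathbf{L}_b)^{-1} = \mathbf{L}_b^{-1} \mathbf{L}_a^{-1}. \tag{7.359}$$
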
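The order reversal in Eq. (7.359) matters whenever the two operators do not commute. A minimal sketch with two arbitrary invertible 2 x 2 matrices (illustrative values) and exact rational arithmetic:

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

La = [[F(2), F(1)], [F(0), F(3)]]
Lb = [[F(1), F(2)], [F(3), F(4)]]

left = inverse2(matmul(La, Lb))                   # (La Lb)^{-1}
right = matmul(inverse2(Lb), inverse2(La))        # Lb^{-1} La^{-1}
wrong_order = matmul(inverse2(La), inverse2(Lb))  # La^{-1} Lb^{-1}
print(left == right, left == wrong_order)
```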

Example 7.38 

Let L be the operator defined by 

Lx = (^ + k 2 ) x(t) = f(t), (7.360) 

where x belongs to the subspace of L 2 [0, tt] with x(0) = a and x(ir) = b. Show that the inverse operator 
L _1 is given by 

x(t) = L- 1 f(t)=b^(n,t)-a^(0 1 t) + J g{r,t)f{r) dr, (7.361) 

where g(r,t) is the Green's function. 

From the definition of L and L _1 in Eqs. (|7.360I7.361|) . we get 

L-\Lx)=b^(n 1 t)-a^(0,t) + j\(T,t)(^-^ + k 2 x(T)) dr. (7.362) 

S v ' 

=/(t) 

Using integration by parts and the property that g(0, t) = g(n, t) = 0, the integral on the right side of 
Eq. (J7.362I can be simplified as 



g(T : t)[^l + k 2 x(T)) dr - *(tt) || (tt, t) + a;(0) |? (0, *) 



dr 2 ' J ^J- dr y ' ^J- dr 

=b 



4- / t.(t\ ( 

d 



x(t) ( f-§ + k 2 g ) dr. (7.363) 



=S(t-T) 

Since x(0) = a, x(ir) = b, and 

J| + k 2 g = 5{t - r), (7.364) 

we have 

L- x (La;) = / x{r)5{t - r) dr, (7.365) 

= x(i). (7.366) 

Thus, L _1 L = I, proving the proposition. 

Note, it is easily shown for this problem that the Green's function is 

sinffc(7r — t)) sin(fct) 
9(r,t) = , ■ ( !\ t<r, (7.367) 

sinffcr) sin(fc(7r — t)) 

= — ! ■ (h \ — - -r<t, ( 7 - 368 ) 

fcsm(K7r) 
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so that we can write x(t) explicitly in terms of the forcing function f(t) including the inhomogeneous 
boundary conditions as follows: 



x(t) 



fcsin(fci) asin(fc(7r — £)) 
sin(/c7r) sin(fc7r) 

sin(A;(7r — t)) 



ksm(ki:) J 



/(r) sin(fcr) dr 



sin(fct) 
ksm(k7:) 



f(r) sin(fc(7r — t)) dr. 



(7.369) 
(7.370) 



J 
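The explicit Green's function solution of Example 7.38 can be evaluated by quadrature. What follows is a minimal sketch, assuming Simpson's rule for the two integrals and a non-integer $k$ so that $\sin(k\pi) \ne 0$; the function names and the quadrature resolution are illustrative assumptions.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def solve_with_green(f, k, a, b, t):
    """Evaluate x(t) for x'' + k^2 x = f on [0, pi] with x(0) = a, x(pi) = b."""
    s = math.sin(k * math.pi)  # must be non-zero, so k may not be an integer
    # Contribution of the inhomogeneous boundary conditions.
    boundary = (b * math.sin(k * t) + a * math.sin(k * (math.pi - t))) / s
    # Contribution of the forcing, split at tau = t where g changes branch.
    left = simpson(lambda tau: f(tau) * math.sin(k * tau), 0.0, t)
    right = simpson(lambda tau: f(tau) * math.sin(k * (math.pi - tau)), t, math.pi)
    forced = -(math.sin(k * (math.pi - t)) * left + math.sin(k * t) * right) / (k * s)
    return boundary + forced
```

For constant forcing $f = 1$ the result can be compared with the elementary solution $x = 1/k^2 + C_1 \cos kt + C_2 \sin kt$ fitted to the same boundary values.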



For linear algebraic systems, the reciprocal or dual basis can be easily formulated in
terms of operator notation and is closely related to the inverse operator. If we define $U$ to
be an $N \times N$ matrix which has the $N$ basis vectors $u_n$, each of length $N$, which span the
$N$-dimensional space, we seek $U^R$, the $N \times N$ matrix which has as its columns the vectors
$u_m^R$ which form the reciprocal or dual basis. The reciprocal basis is found by enforcing the
equivalent of $\langle u_n, u_m^R \rangle = \delta_{nm}$:

$$ \overline{U}^T \cdot U^R = I. \tag{7.371} $$

Solving for $U^R$,

$$ \overline{U}^T \cdot U^R = I, \tag{7.372} $$
$$ U^T \cdot \overline{U^R} = I, \tag{7.373} $$
$$ \left( U^T \cdot \overline{U^R} \right)^T = I^T, \tag{7.374} $$
$$ \overline{U^R}^T \cdot U = I, \tag{7.375} $$
$$ \overline{U^R}^T \cdot U \cdot U^{-1} = I \cdot U^{-1}, \tag{7.376} $$
$$ \overline{U^R}^T = U^{-1}, \tag{7.377} $$
$$ U^R = \overline{U^{-1}}^T, \tag{7.378} $$

we see that the set of reciprocal basis vectors is given by the conjugate transpose of the inverse
of the original matrix of basis vectors. Then the expression for the amplitudes modulating
the basis vectors, $\alpha_n = \langle u_n^R, x \rangle$, is

$$ \alpha = \overline{U^R}^T \cdot x. \tag{7.379} $$

Substituting for $U^R$ in terms of its definition, we can also say

$$ \alpha = \overline{\overline{U^{-1}}^T}^T \cdot x = U^{-1} \cdot x. \tag{7.380} $$

Then the expansion for the vector $x = \sum_{n=1}^N \alpha_n u_n = \sum_{n=1}^N \langle u_n^R, x \rangle u_n$ is written in the
alternate notation as

$$ x = U \cdot \alpha = U \cdot U^{-1} \cdot x = x. \tag{7.381} $$
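Example 7.39

Consider the problem of a previous example with $x = (3, 5)^T$ and with basis vectors $u_1 = (2, 0)^T$ and
$u_2 = (1, 3)^T$; find the reciprocal basis vectors and an expansion of $x$ in terms of the basis vectors.

Using the alternate vector and matrix notation, we define the matrix of basis vectors as

$$ U = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}. \tag{7.382} $$

Since this matrix is real, the complex conjugation process is not important, but it will be retained for
completeness. Using standard techniques, we find that the inverse is

$$ U^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{6} \\ 0 & \frac{1}{3} \end{pmatrix}. \tag{7.383} $$

Thus, the matrix with the reciprocal basis vectors in its columns is

$$ U^R = \overline{U^{-1}}^T = \begin{pmatrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{1}{3} \end{pmatrix}. \tag{7.384} $$

This agrees with the earlier analysis. For $x = (3, 5)^T$, we find the coefficients $\alpha$ to be

$$ \alpha = \overline{U^R}^T \cdot x = \begin{pmatrix} \frac{1}{2} & -\frac{1}{6} \\ 0 & \frac{1}{3} \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} \\ \frac{5}{3} \end{pmatrix}. \tag{7.385} $$

We see that we do indeed recover $x$ upon taking the product

$$ x = U \cdot \alpha = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \cdot \begin{pmatrix} \frac{2}{3} \\ \frac{5}{3} \end{pmatrix} = \frac{2}{3} \begin{pmatrix} 2 \\ 0 \end{pmatrix} + \frac{5}{3} \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}. \tag{7.386} $$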
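The reciprocal-basis recipe can be traced in a few lines of code. This is a sketch under stated assumptions: it takes the basis vectors to be $u_1 = (2, 0)^T$, $u_2 = (1, 3)^T$ and $x = (3, 5)^T$, treats the real case only (so conjugation is a no-op), and uses hypothetical helper names.

```python
def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate; raises if singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

U = [[2.0, 1.0], [0.0, 3.0]]   # columns are u_1 = (2, 0)^T and u_2 = (1, 3)^T (assumed)
x = [3.0, 5.0]

U_inv = inverse(U)
U_R = transpose(U_inv)                 # reciprocal basis vectors as columns (real case)
alpha = matvec(U_inv, x)               # amplitudes alpha = U^{-1} x
x_rebuilt = matvec(U, alpha)           # x = U alpha
gram = matmul(transpose(U), U_R)       # <u_n, u_m^R>, should be the identity
```

The Gram matrix `gram` is the discrete statement of $\langle u_n, u_m^R \rangle = \delta_{nm}$, and `x_rebuilt` reproduces `x`.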



7.4.4 Eigenvalues and eigenvectors 

Let us consider here in a more formal fashion topics that have been previously introduced
in Secs. 5.1 and 6.2.5. If $L$ is a linear operator, its eigenvalue problem consists of finding a
nontrivial solution of the equation

$$ Le = \lambda e, \tag{7.387} $$

where $e$ is called an eigenvector, and $\lambda$ an eigenvalue.

Theorem 

The eigenvalues of an operator and its adjoint are complex conjugates of each other. 
Proof: Let $\lambda$ and $\lambda^*$ be the eigenvalues of $L$ and $L^*$, respectively, and let $e$ and $e^*$ be the
corresponding eigenvectors. Consider then,

$$ \langle Le, e^* \rangle = \langle e, L^* e^* \rangle, \tag{7.388} $$
$$ \langle \lambda e, e^* \rangle = \langle e, \lambda^* e^* \rangle, \tag{7.389} $$
$$ \overline{\lambda} \langle e, e^* \rangle = \lambda^* \langle e, e^* \rangle, \tag{7.390} $$
$$ \overline{\lambda} = \lambda^*. \tag{7.391} $$

This holds for $\langle e, e^* \rangle \ne 0$, which will hold in general.

Theorem

The eigenvalues of a self-adjoint operator are real.

Proof:

Since the operator is self-adjoint, we have

$$ \langle Le, e \rangle = \langle e, Le \rangle, \tag{7.392} $$
$$ \langle \lambda e, e \rangle = \langle e, \lambda e \rangle, \tag{7.393} $$
$$ \overline{\lambda} \langle e, e \rangle = \lambda \langle e, e \rangle, \tag{7.394} $$
$$ \overline{\lambda} = \lambda, \tag{7.395} $$
$$ \lambda_R - i \lambda_I = \lambda_R + i \lambda_I, \qquad \lambda_R, \lambda_I \in \mathbb{R}^1, \tag{7.396} $$
$$ \lambda_R = \lambda_R, \tag{7.397} $$
$$ -\lambda_I = \lambda_I, \tag{7.398} $$
$$ \lambda_I = 0. \tag{7.399} $$

Here we note that for non-trivial eigenvectors $\langle e, e \rangle > 0$, so the division can be performed.
The only way a complex number can equal its conjugate is if its imaginary part is zero;
consequently, the eigenvalue must be strictly real.

Theorem

The eigenvectors of a self-adjoint operator corresponding to distinct eigenvalues are orthogonal.

Proof: Let $\lambda_i$ and $\lambda_j$ be two distinct, $\lambda_i \ne \lambda_j$, real, $\lambda_i, \lambda_j \in \mathbb{R}^1$, eigenvalues of the self-adjoint
operator $L$, and let $e_i$ and $e_j$ be the corresponding eigenvectors. Then,

$$ \langle L e_i, e_j \rangle = \langle e_i, L e_j \rangle, \tag{7.400} $$
$$ \langle \lambda_i e_i, e_j \rangle = \langle e_i, \lambda_j e_j \rangle, \tag{7.401} $$
$$ \lambda_i \langle e_i, e_j \rangle = \lambda_j \langle e_i, e_j \rangle, \tag{7.402} $$
$$ \langle e_i, e_j \rangle (\lambda_i - \lambda_j) = 0, \tag{7.403} $$
$$ \langle e_i, e_j \rangle = 0, \tag{7.404} $$

since $\lambda_i \ne \lambda_j$.
Theorem

The eigenvectors of any self-adjoint operator on vectors of a finite-dimensional vector
space constitute a basis for the space.

As discussed by Friedman, the following conditions are sufficient for the eigenvectors in
an infinite-dimensional Hilbert space to form a complete basis:

• the operator must be self-adjoint,

• the operator is defined on a finite domain, and

• the operator has no singularities in its domain.

If the operator is not self-adjoint, Friedman (p. 204) discusses how the eigenfunctions of
the adjoint operator can be used to obtain the coefficients $\alpha_k$ on the eigenfunctions of the
operator.



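Example 7.40

For $x \in \mathbb{R}^2$, $A : \mathbb{R}^2 \to \mathbb{R}^2$, find the eigenvalues and eigenvectors of

$$ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \tag{7.405} $$

The eigenvalue problem is

$$ Ax = \lambda x, \tag{7.406} $$

which can be written as

$$ Ax = \lambda I x, \tag{7.407} $$
$$ (A - \lambda I) x = 0, \tag{7.408} $$

where the identity matrix is

$$ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7.409} $$

If we write

$$ x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \tag{7.410} $$

then

$$ \begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{7.411} $$

By Cramer's rule we could say

$$ x_1 = \frac{\det \begin{pmatrix} 0 & 1 \\ 0 & 2 - \lambda \end{pmatrix}}{\det \begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix}} = \frac{0}{\det \begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix}}, \tag{7.412} $$

$$ x_2 = \frac{\det \begin{pmatrix} 2 - \lambda & 0 \\ 1 & 0 \end{pmatrix}}{\det \begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix}} = \frac{0}{\det \begin{pmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{pmatrix}}. \tag{7.413} $$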

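An obvious, but uninteresting solution is the trivial solution $x_1 = 0$, $x_2 = 0$. Nontrivial solutions of $x_1$
and $x_2$ can be obtained only if

$$ \begin{vmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{vmatrix} = 0, \tag{7.414} $$

which gives the characteristic equation

$$ (2 - \lambda)^2 - 1 = 0. \tag{7.415} $$

Solutions are $\lambda_1 = 1$ and $\lambda_2 = 3$. The eigenvector corresponding to each eigenvalue is found in the
following manner. The eigenvalue is substituted in Eq. (7.411). A dependent set of equations in $x_1$ and
$x_2$ is obtained. The eigenvector solution is thus not unique.
For $\lambda = 1$, Eq. (7.411) gives

$$ \begin{pmatrix} 2 - 1 & 1 \\ 1 & 2 - 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{7.416} $$

which are the two identical equations,

$$ x_1 + x_2 = 0. \tag{7.417} $$

If we choose $x_1 = \gamma$, then $x_2 = -\gamma$. So the eigenvector corresponding to $\lambda = 1$ is

$$ e_1 = \gamma \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{7.418} $$

Since the magnitude of an eigenvector is arbitrary, we will take $\gamma = 1$ and thus

$$ e_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{7.419} $$

For $\lambda = 3$, the equations are

$$ \begin{pmatrix} 2 - 3 & 1 \\ 1 & 2 - 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{7.420} $$

which yield the two identical equations,

$$ -x_1 + x_2 = 0. \tag{7.421} $$

This yields an eigenvector of

$$ e_2 = \beta \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{7.422} $$

We take $\beta = 1$, so that

$$ e_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{7.423} $$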

Comments: 

• Since the real matrix is symmetric (thus, self-adjoint), the eigenvalues are real, and the eigenvectors 
are orthogonal. 
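• We have actually solved for the right eigenvectors. This is the usual set of eigenvectors. The left
eigenvectors can be found from $\overline{x}^T A = \overline{x}^T I \lambda$. Since here $A$ is equal to its conjugate transpose
and $\lambda$ is real, taking the conjugate transpose of this equation gives $Ax = \lambda x$, so the left eigenvectors
are the same as the right eigenvectors. More generally, we can say the left eigenvectors of an operator
are the right eigenvectors of the adjoint of that operator, $\overline{A}^T$.

• Multiplication of an eigenvector by any scalar is also an eigenvector.

• The normalized eigenvectors are

$$ e_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}, \qquad e_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{7.424} $$

• A natural way to express a vector is on an orthonormal basis as given here

$$ x = c_1 \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} + c_2 \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} = \underbrace{\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}}_{=Q} \cdot \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \tag{7.425} $$

• The set of orthonormalized eigenvectors forms an orthogonal matrix $Q$; see p. 183 or the upcoming
Sec. 8.6. Note that it has determinant of unity, so it is a rotation. As suggested by Eq. (6.54), the
angle of rotation here is $\alpha = \sin^{-1}(-1/\sqrt{2}) = -\pi/4$.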
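The eigensystem of Example 7.40 can be reproduced from the closed-form solution of the $2 \times 2$ characteristic equation. The following is a sketch for real symmetric matrices with a non-zero off-diagonal entry; the function name and the eigenvector sign convention are assumptions made for illustration.

```python
import math

def symmetric_eig(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]] with b != 0.

    Returns [(lambda_1, e_1), (lambda_2, e_2)] with lambda_1 <= lambda_2 and
    unit-length eigenvectors.
    """
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)   # half the gap between the roots
    pairs = []
    for lam in (mean - radius, mean + radius):
        v = (b, lam - a)                    # satisfies (a - lam) v0 + b v1 = 0
        norm = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

(lam1, e1), (lam2, e2) = symmetric_eig(2.0, 1.0, 2.0)
orthogonality = e1[0] * e2[0] + e1[1] * e2[1]   # <e1, e2>
```

For the matrix of the example this yields $\lambda_1 = 1$, $\lambda_2 = 3$ and the normalized eigenvectors of Eq. (7.424), with `orthogonality` equal to zero to rounding error.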



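Example 7.41

For $x \in \mathbb{C}^2$, $A : \mathbb{C}^2 \to \mathbb{C}^2$, find the eigenvalues and eigenvectors of

$$ A = \begin{pmatrix} 0 & -2 \\ 2 & 0 \end{pmatrix}. \tag{7.426} $$

This matrix is anti-symmetric. We find the eigensystem by solving

$$ (A - \lambda I) e = 0. \tag{7.427} $$

The characteristic equation which results is

$$ \lambda^2 + 4 = 0, \tag{7.428} $$

which has two imaginary roots which are complex conjugates: $\lambda_1 = 2i$, $\lambda_2 = -2i$. The corresponding
eigenvectors are

$$ e_1 = \alpha \begin{pmatrix} i \\ 1 \end{pmatrix}, \qquad e_2 = \beta \begin{pmatrix} -i \\ 1 \end{pmatrix}, \tag{7.429} $$

where $\alpha$ and $\beta$ are arbitrary scalars. Let us take $\alpha = -i$, $\beta = 1$, so

$$ e_1 = \begin{pmatrix} 1 \\ -i \end{pmatrix}, \qquad e_2 = \begin{pmatrix} -i \\ 1 \end{pmatrix}. \tag{7.430} $$

Note that

$$ \langle e_1, e_2 \rangle = \overline{e_1}^T e_2 = \begin{pmatrix} 1 & i \end{pmatrix} \begin{pmatrix} -i \\ 1 \end{pmatrix} = (-i) + i = 0, \tag{7.431} $$

so this is an orthogonal set of vectors, even though the generating matrix was not self-adjoint. We can
render it orthonormal by scaling by the magnitude of each eigenvector. The orthonormal eigenvector
set is

$$ e_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} \end{pmatrix}, \qquad e_2 = \begin{pmatrix} -\frac{i}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{7.432} $$

These two orthonormalized vectors can form a matrix $Q$:

$$ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{i}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{7.433} $$

It is easy to check that $\|Q\|_2 = 1$ and $\det Q = 1$, so it is a rotation. However, for the complex basis
vectors, it is difficult to define an angle of rotation in the traditional sense. Our special choices of $\alpha$
and $\beta$ were actually made to ensure $\det Q = 1$.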

Example 7.42

For $x \in \mathbb{C}^2$, $A : \mathbb{C}^2 \to \mathbb{C}^2$, find the eigenvalues and eigenvectors of

$$ A = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}. \tag{7.434} $$

This matrix is asymmetric. We find the eigensystem by solving

$$ (A - \lambda I) e = 0. \tag{7.435} $$

The characteristic equation which results is

$$ (1 - \lambda)^2 = 0, \tag{7.436} $$

which has repeated roots $\lambda = 1$, $\lambda = 1$. For this eigenvalue, there is only one ordinary eigenvector

$$ e = \alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{7.437} $$

We take arbitrarily $\alpha = 1$ so that

$$ e = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{7.438} $$

We can however find a generalized eigenvector $g$ such that

$$ (A - \lambda I) g = e. \tag{7.439} $$

Note then that

$$ (A - \lambda I)(A - \lambda I) g = (A - \lambda I) e, \tag{7.440} $$
$$ (A - \lambda I)^2 g = 0. \tag{7.441} $$

Now

$$ (A - \lambda I) = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}. \tag{7.442} $$

So with $g = (\beta, \gamma)^T$, take from Eq. (7.439)

$$ \underbrace{\begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}}_{A - \lambda I} \underbrace{\begin{pmatrix} \beta \\ \gamma \end{pmatrix}}_{g} = \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{e}. \tag{7.443} $$

We get a solution if $\beta \in \mathbb{R}^1$, $\gamma = -1$. That is

$$ g = \begin{pmatrix} \beta \\ -1 \end{pmatrix}. \tag{7.444} $$

Take $\beta = 0$ to give an orthogonal generalized eigenvector. So

$$ g = \begin{pmatrix} 0 \\ -1 \end{pmatrix}. \tag{7.445} $$

Note that the ordinary eigenvector and the generalized eigenvector combine to form a basis, in this case
an orthonormal basis.

More properly, we should distinguish the generalized eigenvector we have found as a generalized
eigenvector in the first sense. There is another common, unrelated generalization in usage which we
will study later in Sec. 8.3.2.
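The defining relations of the ordinary and generalized eigenvectors in Example 7.42 can be confirmed with integer arithmetic. This is a small sketch; the helper name `matvec` and the variable names are illustrative.

```python
def matvec(m, v):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

A = [[1, -1], [0, 1]]
lam = 1                                                   # repeated eigenvalue
N = [[A[0][0] - lam, A[0][1]], [A[1][0], A[1][1] - lam]]  # A - lambda I

e = [1, 0]     # ordinary eigenvector, Eq. (7.438)
g = [0, -1]    # generalized eigenvector, Eq. (7.445)

N_e = matvec(N, e)       # expected to vanish: (A - lambda I) e = 0
N_g = matvec(N, g)       # expected to equal e: (A - lambda I) g = e
NN_g = matvec(N, N_g)    # expected to vanish: (A - lambda I)^2 g = 0
e_dot_g = e[0] * g[0] + e[1] * g[1]   # orthogonality of e and g
```

Because every entry is an integer, each of the three relations holds exactly rather than to rounding error.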

Example 7.43

For $x \in \mathbb{C}^2$, $A : \mathbb{C}^2 \to \mathbb{C}^2$, find the eigenvalues, right eigenvectors, and left eigenvectors if

$$ A = \begin{pmatrix} 1 & 2 \\ -3 & 1 \end{pmatrix}. \tag{7.446} $$

The right eigenvector problem is the usual

$$ A e_R = \lambda I e_R. \tag{7.447} $$

The characteristic polynomial is

$$ (1 - \lambda)^2 + 6 = 0, \tag{7.448} $$

which has complex roots. The eigensystem is

$$ \lambda_1 = 1 - \sqrt{6}\, i, \quad e_{1R} = \begin{pmatrix} \sqrt{\frac{2}{3}}\, i \\ 1 \end{pmatrix}, \qquad \lambda_2 = 1 + \sqrt{6}\, i, \quad e_{2R} = \begin{pmatrix} -\sqrt{\frac{2}{3}}\, i \\ 1 \end{pmatrix}. \tag{7.449} $$

Note as the operator is not self-adjoint, we are not guaranteed real eigenvalues. The right eigenvectors
are not orthogonal as $\overline{e_{1R}}^T e_{2R} = 1/3$.
For the left eigenvectors, we have

$$ \overline{e_L}^T A = \overline{e_L}^T I \lambda. \tag{7.450} $$

We can put this in a slightly more standard form by taking the conjugate transpose of both sides:

CC BY-NC-ND. 29 July 2012, Sen & Powers.






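$$ \overline{\left( \overline{e_L}^T A \right)}^T = \overline{\left( \overline{e_L}^T I \lambda \right)}^T, \tag{7.451} $$
$$ \overline{A}^T e_L = \overline{(I \lambda)}^T e_L, \tag{7.452} $$
$$ \overline{A}^T e_L = I \overline{\lambda} e_L, \tag{7.453} $$
$$ A^* e_L = I \lambda^* e_L. \tag{7.454} $$

So the left eigenvectors of $A$ are the right eigenvectors of the adjoint of $A$. Now we have

$$ \overline{A}^T = \begin{pmatrix} 1 & -3 \\ 2 & 1 \end{pmatrix}. \tag{7.455} $$

The resulting eigensystem is

$$ \lambda_1^* = 1 + \sqrt{6}\, i, \quad e_{1L} = \begin{pmatrix} \sqrt{\frac{3}{2}}\, i \\ 1 \end{pmatrix}, \qquad \lambda_2^* = 1 - \sqrt{6}\, i, \quad e_{2L} = \begin{pmatrix} -\sqrt{\frac{3}{2}}\, i \\ 1 \end{pmatrix}. \tag{7.456} $$

Note that in addition to being complex conjugates of each other, which does not hold for general
complex matrices (it holds here because $A$ is real), the eigenvalues of the adjoint are complex conjugates
of those of the original matrix, which does hold for general complex matrices. That is $\lambda^* = \overline{\lambda}$. The left
eigenvectors are not orthogonal as $\overline{e_{1L}}^T e_{2L} = -\frac{1}{2}$. It is easily shown by taking the conjugate transpose
of the adjoint eigenvalue problem however that

$$ \overline{e_L}^T A = \overline{e_L}^T \lambda, \tag{7.457} $$

as desired. Note that the eigenvalues for both the left and right eigensystems are the same.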

Example 7.44

Consider a small change from the previous example. For $x \in \mathbb{C}^2$, $A : \mathbb{C}^2 \to \mathbb{C}^2$, find the eigenvalues,
right eigenvectors, and left eigenvectors if

$$ A = \begin{pmatrix} 1 & 2 \\ -3 & 1 + i \end{pmatrix}. \tag{7.458} $$

The right eigenvector problem is the usual

$$ A e_R = \lambda I e_R. \tag{7.459} $$

The characteristic polynomial is

$$ \lambda^2 - (2 + i) \lambda + (7 + i) = 0, \tag{7.460} $$

which has complex roots. The eigensystem is

$$ \lambda_1 = 1 - 2i, \quad e_{1R} = \begin{pmatrix} i \\ 1 \end{pmatrix}, \qquad \lambda_2 = 1 + 3i, \quad e_{2R} = \begin{pmatrix} -\frac{2}{3} i \\ 1 \end{pmatrix}. \tag{7.461} $$

Note as the operator is not self-adjoint, we are not guaranteed real eigenvalues. The right eigenvectors
are not orthogonal as $\overline{e_{1R}}^T e_{2R} = 1/3$.




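For the left eigenvectors, we solve the corresponding right eigensystem for the adjoint of $A$ which
is $A^* = \overline{A}^T$.

$$ \overline{A}^T = \begin{pmatrix} 1 & -3 \\ 2 & 1 - i \end{pmatrix}. \tag{7.462} $$

The eigenvalue problem is $\overline{A}^T e_L = \lambda^* e_L$. The eigensystem is

$$ \lambda_1^* = 1 + 2i, \quad e_{1L} = \begin{pmatrix} \frac{3}{2} i \\ 1 \end{pmatrix}, \qquad \lambda_2^* = 1 - 3i, \quad e_{2L} = \begin{pmatrix} -i \\ 1 \end{pmatrix}. \tag{7.463} $$

Note that here, the eigenvalues $\lambda_1^*$, $\lambda_2^*$ have no relation to each other, but they are complex conjugates
of the eigenvalues, $\lambda_1$, $\lambda_2$, of the right eigenvalue problem of the original matrix. The left eigenvectors
are not orthogonal as $\overline{e_{1L}}^T e_{2L} = -\frac{1}{2}$. It is easily shown however that

$$ \overline{e_L}^T A = \overline{e_L}^T \lambda I, \tag{7.464} $$

as desired.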
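The right and left eigensystems of Example 7.44 can be checked with Python's built-in complex numbers. This is a sketch only: the eigenpairs are typed in from Eqs. (7.461) and (7.463) rather than computed, and the helper names are illustrative.

```python
def matvec(m, v):
    """Matrix times column vector (2x2, complex entries allowed)."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def row_times_matrix(r, m):
    """Row vector times matrix (2x2, complex entries allowed)."""
    return [r[0] * m[0][0] + r[1] * m[1][0], r[0] * m[0][1] + r[1] * m[1][1]]

def inner(u, v):
    """<u, v> = conj(u)^T v, conjugating the first argument."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def right_residual(A, lam, e):
    """max |A e - lam e|."""
    Ae = matvec(A, e)
    return max(abs(Ae[i] - lam * e[i]) for i in range(2))

def left_residual(A, lam, eL):
    """max |conj(eL)^T A - lam conj(eL)^T|."""
    row = [c.conjugate() for c in eL]
    rA = row_times_matrix(row, A)
    return max(abs(rA[i] - lam * row[i]) for i in range(2))

A = [[1 + 0j, 2 + 0j], [-3 + 0j, 1 + 1j]]
lam1, e1R, e1L = 1 - 2j, [1j, 1 + 0j], [1.5j, 1 + 0j]
lam2, e2R, e2L = 1 + 3j, [-2j / 3, 1 + 0j], [-1j, 1 + 0j]
```

Beyond the residuals, `inner(e1L, e2R)` and `inner(e2L, e1R)` vanish: left and right eigenvectors belonging to different eigenvalues are orthogonal even though neither set is orthogonal by itself.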



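Example 7.45

For $x \in \mathbb{R}^3$, $A : \mathbb{R}^3 \to \mathbb{R}^3$, find the eigenvalues and eigenvectors of

$$ A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}. \tag{7.465} $$

From

$$ \begin{vmatrix} 2 - \lambda & 0 & 0 \\ 0 & 1 - \lambda & 1 \\ 0 & 1 & 1 - \lambda \end{vmatrix} = 0, \tag{7.466} $$

the characteristic equation is

$$ (2 - \lambda) \left( (1 - \lambda)^2 - 1 \right) = 0. \tag{7.467} $$

The solutions are $\lambda = 0, 2, 2$. The second eigenvalue is of multiplicity two. Next, we find the eigenvectors

$$ e = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \tag{7.468} $$

For $\lambda = 0$, the equations for the components of the eigenvectors are

$$ \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \tag{7.469} $$
$$ 2 x_1 = 0, \tag{7.470} $$
$$ x_2 + x_3 = 0, \tag{7.471} $$

from which

$$ e_1 = \alpha \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}. \tag{7.472} $$

For $\lambda = 2$, we have

$$ \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{7.473} $$

This yields only

$$ -x_2 + x_3 = 0. \tag{7.474} $$

We then see that the following eigenvector,

$$ e = \begin{pmatrix} \beta \\ \gamma \\ \gamma \end{pmatrix}, \tag{7.475} $$

satisfies Eq. (7.474). Here, we have two free parameters, $\beta$ and $\gamma$; we can thus extract two independent
eigenvectors from this. For $e_2$ we arbitrarily take $\beta = 0$ and $\gamma = 1$ to get

$$ e_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}. \tag{7.476} $$

For $e_3$ we arbitrarily take $\beta = 1$ and $\gamma = 0$ to get

$$ e_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. \tag{7.477} $$

In this case $e_1$, $e_2$, $e_3$ are orthogonal even though $e_2$ and $e_3$ correspond to the same eigenvalue.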

Example 7.46

For $y \in L_2[0, 1]$, find the eigenvalues and eigenvectors of $L = -d^2/dt^2$, operating on functions which
vanish at 0 and 1. Also find $\|L\|_2$.

The eigenvalue problem is

$$ Ly = -\frac{d^2 y}{dt^2} = \lambda y, \qquad y(0) = y(1) = 0, \tag{7.478} $$
$$ \frac{d^2 y}{dt^2} + \lambda y = 0, \qquad y(0) = y(1) = 0. \tag{7.479} $$

The solution of this differential equation is

$$ y(t) = a \sin \lambda^{1/2} t + b \cos \lambda^{1/2} t. \tag{7.480} $$

The boundary condition $y(0) = 0$ gives $b = 0$. The other condition $y(1) = 0$ gives $a \sin \lambda^{1/2} = 0$. A
nontrivial solution can only be obtained if

$$ \sin \lambda^{1/2} = 0. \tag{7.481} $$

There are an infinite but countable number of values of $\lambda$ for which this can be satisfied. These are
$\lambda_n = n^2 \pi^2$, $n = 1, 2, \ldots$. The eigenvectors (also called eigenfunctions in this case) $y_n(t)$, $n = 1, 2, \ldots$
are

$$ y_n(t) = \sin n \pi t. \tag{7.482} $$

The differential operator is self-adjoint so that the eigenvalues are real and the eigenfunctions are
orthogonal.

Consider $\|L\|_2$. Referring to the definition of Eq. (7.299), we see $\|L\|_2 = \infty$, since by allowing $y$ to
be any eigenfunction, we have

$$ \frac{\|Ly\|_2}{\|y\|_2} = \frac{\|\lambda y\|_2}{\|y\|_2}, \tag{7.483} $$
$$ = \frac{|\lambda| \|y\|_2}{\|y\|_2}, \tag{7.484} $$
$$ = |\lambda|. \tag{7.485} $$

And since $\lambda = n^2 \pi^2$, $n = 1, 2, \ldots, \infty$, the largest value that can be achieved by $\|Ly\|_2 / \|y\|_2$ is infinite.



Example 7.47

For $x \in L_2[0, 1]$, and $L = d^2/ds^2 + d/ds$ with $x(0) = x(1) = 0$, find the Fourier expansion of an
arbitrary function $f(s)$ in terms of the eigenfunctions of $L$. Find the series representation of the "top
hat" function

$$ f(s) = H\left(s - \frac{1}{4}\right) - H\left(s - \frac{3}{4}\right). \tag{7.486} $$

We seek expressions for $\alpha_n$ in

$$ f(s) = \sum_{n=1}^N \alpha_n x_n(s). \tag{7.487} $$

Here $x_n(s)$ is an eigenfunction of $L$.
The eigenvalue problem is

$$ Lx = \frac{d^2 x}{ds^2} + \frac{dx}{ds} = \lambda x, \qquad x(0) = x(1) = 0. \tag{7.488} $$

It is easily shown that the eigenvalues of $L$ are given by

$$ \lambda_n = -\frac{1}{4} - n^2 \pi^2, \qquad n = 1, 2, 3, \ldots \tag{7.489} $$

where $n$ is a positive integer, and the unnormalized eigenfunctions of $L$ are

$$ x_n(s) = e^{-s/2} \sin(n \pi s), \qquad n = 1, 2, 3, \ldots \tag{7.490} $$

Although the eigenvalues are real, the eigenfunctions are not orthogonal. We see this, for example,
by forming $\langle x_1, x_2 \rangle$:

$$ \langle x_1, x_2 \rangle = \int_0^1 \underbrace{e^{-s/2} \sin(\pi s)}_{=x_1(s)} \underbrace{e^{-s/2} \sin(2 \pi s)}_{=x_2(s)}\, ds, \tag{7.491} $$
$$ \langle x_1, x_2 \rangle = \frac{4 (1 + e) \pi^2}{e (1 + \pi^2)(1 + 9 \pi^2)} \ne 0. \tag{7.492} $$
By using integration by parts, we calculate the adjoint operator to be

$$ L^* y = \frac{d^2 y}{ds^2} - \frac{dy}{ds} = \lambda^* y, \qquad y(0) = y(1) = 0. \tag{7.493} $$

We then find the eigenvalues of the adjoint operator to be the same as those of the operator (this is
true because the eigenvalues are real; in general they are complex conjugates of one another).

$$ \lambda_m^* = \overline{\lambda_m} = -\frac{1}{4} - m^2 \pi^2, \qquad m = 1, 2, 3, \ldots \tag{7.494} $$

where $m$ is a positive integer.

The unnormalized eigenfunctions of the adjoint are

$$ y_m(s) = e^{s/2} \sin(m \pi s), \qquad m = 1, 2, 3, \ldots \tag{7.495} $$

Now, since by definition $\langle y_m, L x_n \rangle = \langle L^* y_m, x_n \rangle$, we have

$$ \langle y_m, L x_n \rangle - \langle L^* y_m, x_n \rangle = 0, \tag{7.496} $$
$$ \langle y_m, \lambda_n x_n \rangle - \langle \lambda_m^* y_m, x_n \rangle = 0, \tag{7.497} $$
$$ \lambda_n \langle y_m, x_n \rangle - \overline{\lambda_m^*} \langle y_m, x_n \rangle = 0, \tag{7.498} $$
$$ (\lambda_n - \lambda_m) \langle y_m, x_n \rangle = 0. \tag{7.499} $$

So, for $m \ne n$, where $\lambda_n \ne \lambda_m$, we must have $\langle y_m, x_n \rangle = 0$, while for $m = n$ the inner product
$\langle y_n, x_n \rangle$ is left free and is in fact non-zero. Thus, we must have the
so-called bi-orthogonality condition

$$ \langle y_m, x_n \rangle = D_{mn}, \tag{7.500} $$
$$ D_{mn} = 0 \quad \text{if} \quad m \ne n. \tag{7.501} $$

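Here $D_{mn}$ is a diagonal matrix which can be reduced to the identity matrix with proper normalization.
Now consider the following series of operations on the original form of the expansion we seek

$$ f(s) = \sum_{n=1}^N \alpha_n x_n(s), \tag{7.502} $$
$$ \langle y_j(s), f(s) \rangle = \langle y_j(s), \sum_{n=1}^N \alpha_n x_n(s) \rangle, \tag{7.503} $$
$$ \langle y_j(s), f(s) \rangle = \sum_{n=1}^N \alpha_n \langle y_j(s), x_n(s) \rangle, \tag{7.504} $$
$$ \langle y_j(s), f(s) \rangle = \alpha_j \langle y_j(s), x_j(s) \rangle, \tag{7.505} $$
$$ \alpha_j = \frac{\langle y_j(s), f(s) \rangle}{\langle y_j(s), x_j(s) \rangle}, \tag{7.506} $$
$$ \alpha_n = \frac{\langle y_n(s), f(s) \rangle}{\langle y_n(s), x_n(s) \rangle}, \qquad n = 1, 2, 3, \ldots \tag{7.507} $$

Now in the case at hand, it is easily shown that

$$ \langle y_n(s), x_n(s) \rangle = \frac{1}{2}, \qquad n = 1, 2, 3, \ldots, \tag{7.508} $$

so we have

$$ \alpha_n = 2 \langle y_n(s), f(s) \rangle. \tag{7.509} $$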
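The bi-orthogonality condition and the resulting Fourier coefficients can be checked by quadrature. The following is a minimal sketch, assuming Simpson's rule and the top hat function of Eq. (7.486); the function names are illustrative and the quadrature resolution is an arbitrary choice.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def x_n(n, s):
    """Unnormalized eigenfunction of L = d^2/ds^2 + d/ds."""
    return math.exp(-s / 2.0) * math.sin(n * math.pi * s)

def y_m(m, s):
    """Unnormalized eigenfunction of the adjoint L* = d^2/ds^2 - d/ds."""
    return math.exp(s / 2.0) * math.sin(m * math.pi * s)

def cross_product(m, n):
    """<y_m, x_n> on [0, 1]; expected to be 1/2 for m == n and 0 otherwise."""
    return simpson(lambda s: y_m(m, s) * x_n(n, s), 0.0, 1.0)

def self_product(m, n):
    """<x_m, x_n> on [0, 1]; generally non-zero because L is not self-adjoint."""
    return simpson(lambda s: x_n(m, s) * x_n(n, s), 0.0, 1.0)

def top_hat_coefficient(n):
    """alpha_n = 2 <y_n, f> for f = 1 on (1/4, 3/4) and 0 elsewhere."""
    return 2.0 * simpson(lambda t: y_m(n, t), 0.25, 0.75)
```

Integrating only over $(1/4, 3/4)$ in `top_hat_coefficient` avoids sampling the discontinuities of the top hat, which keeps Simpson's rule accurate.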
Figure 7.12: Twenty-term Fourier series approximation to a top hat function in terms of a
non-orthogonal basis.

The $N$-term approximate representation of $f(s)$ is thus given by

$$ f(s) \approx \sum_{n=1}^N \underbrace{\left( 2 \int_0^1 e^{t/2} \sin(n \pi t) f(t)\, dt \right)}_{=\alpha_n} \underbrace{e^{-s/2} \sin(n \pi s)}_{=x_n(s)}, \tag{7.510} $$
$$ = 2 \int_0^1 e^{(t-s)/2} f(t) \sum_{n=1}^N \sin(n \pi t) \sin(n \pi s)\, dt, \tag{7.511} $$
$$ = \int_0^1 e^{(t-s)/2} f(t) \sum_{n=1}^N \left( \cos(n \pi (s - t)) - \cos(n \pi (s + t)) \right) dt. \tag{7.512} $$

For the top hat function, a two-term expansion yields

$$ f(s) \approx \frac{2 \sqrt{2}\, e^{1/8} \left( -1 + 2\pi + e^{1/4} (1 + 2\pi) \right)}{1 + 4 \pi^2} \underbrace{e^{-s/2} \sin(\pi s)}_{=x_1(s)} - \frac{4 \left( e^{1/8} + e^{3/8} \right)}{1 + 16 \pi^2} \underbrace{e^{-s/2} \sin(2 \pi s)}_{=x_2(s)} + \ldots \tag{7.513} $$



A plot of a twenty-term series expansion of the top hat function is shown in Fig. 7.12.

In this exercise, the eigenfunctions of the adjoint are closely related to the reciprocal basis functions.
In fact, we could have easily adjusted the constants on the eigenfunctions to obtain a true reciprocal
basis. Taking

$$ x_n = \sqrt{2}\, e^{-s/2} \sin(n \pi s), \tag{7.514} $$
$$ y_m = \sqrt{2}\, e^{s/2} \sin(m \pi s), \tag{7.515} $$

gives $\langle y_m, x_n \rangle = \delta_{mn}$, as desired for a set of reciprocal basis functions. We see that getting the
Fourier coefficients for eigenfunctions of a non-self-adjoint operator requires consideration of the adjoint
operator. We also note that it is often a difficult exercise in problems with practical significance to
actually find the adjoint operator and its eigenfunctions.
7.5 Equations

The existence and uniqueness of the solution $x$ of the equation

$$ Lx = y, \tag{7.516} $$

for given linear operator $L$ and $y$ is governed by the following theorems.

Theorem

If the range of $L$ is closed, $Lx = y$ has a solution if and only if $y$ is orthogonal to every
solution of the adjoint homogeneous equation $L^* z = 0$.

Theorem

The solution of $Lx = y$ is non-unique if the solution of the homogeneous equation $Lx = 0$
is also non-unique, and conversely.

There are two basic ways in which the equation can be solved. 

• Inverse: If an inverse of $L$ exists then

$$ x = L^{-1} y. \tag{7.517} $$

• Eigenvector expansion: Assume that $x$, $y$ belong to a vector space $\mathbb{S}$ and the eigenvectors
$(e_1, e_2, \ldots)$ of $L$ span $\mathbb{S}$. Then we can write

$$ y = \sum_n \alpha_n e_n, \tag{7.518} $$
$$ x = \sum_n \beta_n e_n, \tag{7.519} $$

where the $\alpha$'s are known and the $\beta$'s are unknown. We get

$$ Lx = y, \tag{7.520} $$
$$ L \underbrace{\left( \sum_n \beta_n e_n \right)}_{x} = \underbrace{\sum_n \alpha_n e_n}_{y}, \tag{7.521} $$
$$ \sum_n L \beta_n e_n = \sum_n \alpha_n e_n, \tag{7.522} $$
$$ \sum_n \beta_n L e_n = \sum_n \alpha_n e_n, \tag{7.523} $$
$$ \sum_n \beta_n \lambda_n e_n = \sum_n \alpha_n e_n, \tag{7.524} $$
$$ \sum_n \underbrace{(\beta_n \lambda_n - \alpha_n)}_{=0} e_n = 0, \tag{7.525} $$

where the $\lambda$'s are the eigenvalues of $L$. Since the $e_n$ are linearly independent, we must
demand for all $n$ that

$$ \beta_n \lambda_n = \alpha_n. \tag{7.526} $$

If all $\lambda_n \ne 0$, then $\beta_n = \alpha_n / \lambda_n$ and we have the unique solution

$$ x = \sum_n \frac{\alpha_n}{\lambda_n} e_n. \tag{7.527} $$

If, however, one of the $\lambda$'s, $\lambda_k$ say, is zero, we still have $\beta_n = \alpha_n / \lambda_n$ for $n \ne k$. For
$n = k$, there are two possibilities:

- If $\alpha_k \ne 0$, no solution is possible since equation (7.526) is not satisfied for $n = k$.

- If $\alpha_k = 0$, we have the non-unique solution

$$ x = \sum_{n \ne k} \frac{\alpha_n}{\lambda_n} e_n + \gamma e_k, \tag{7.528} $$

where $\gamma$ is an arbitrary scalar. Equation (7.526) is satisfied $\forall n$.



Example 7.48

Solve for $x$ in $Lx = y$ if $L = d^2/dt^2$, with side conditions $x(0) = x(1) = 0$, and $y(t) = 2t$, via an
eigenfunction expansion.

This problem of course has an exact solution via straightforward integration:

$$ \frac{d^2 x}{dt^2} = 2t; \qquad x(0) = x(1) = 0, \tag{7.529} $$

integrates to yield

$$ x(t) = \frac{t}{3} (t^2 - 1). \tag{7.530} $$

However, let's use the series expansion technique. This can be more useful in other problems in
which exact solutions do not exist. First, find the eigenvalues and eigenfunctions of the operator:

$$ \frac{d^2 x}{dt^2} = \lambda x; \qquad x(0) = x(1) = 0. \tag{7.531} $$

This has general solution

$$ x(t) = A \sin\left( \sqrt{-\lambda}\, t \right) + B \cos\left( \sqrt{-\lambda}\, t \right). \tag{7.532} $$

To satisfy the boundary conditions, we require that $B = 0$ and $\lambda = -n^2 \pi^2$, so

$$ x(t) = A \sin(n \pi t). \tag{7.533} $$

This suggests that we expand $y(t) = 2t$ in a Fourier sine series. We know from Eq. (7.220) that the
Fourier sine series for $y(t) = 2t$ is

$$ 2t = \sum_{n=1}^\infty \frac{4 (-1)^{n+1}}{n \pi} \sin(n \pi t). \tag{7.534} $$

Figure 7.13: Approximate and exact solution $x(t)$; error in solution $x_p(t) - x(t)$.

For $x(t)$ then we have

$$ x(t) = \sum_{n=1}^\infty \frac{\alpha_n e_n}{\lambda_n} = \sum_{n=1}^\infty \frac{4 (-1)^{n+1}}{n \pi \lambda_n} \sin(n \pi t). \tag{7.535} $$

Substituting in for $\lambda_n = -n^2 \pi^2$, we get

$$ x(t) = \sum_{n=1}^\infty \frac{-4 (-1)^{n+1}}{(n \pi)^3} \sin(n \pi t). \tag{7.536} $$

Retaining only two terms in the expansion for $x(t)$,

$$ x(t) \approx -\frac{4}{\pi^3} \sin(\pi t) + \frac{1}{2 \pi^3} \sin(2 \pi t), \tag{7.537} $$

gives a very good approximation for the solution, which as shown in Fig. 7.13 has a peak error of about
0.008.
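The quoted peak error of the two-term approximation in Example 7.48 can be reproduced directly. The sketch below assumes a uniform sampling of $[0, 1]$ to estimate the peak; the function names and the number of sample points are illustrative choices.

```python
import math

def x_exact(t):
    """Exact solution x(t) = t (t^2 - 1) / 3 of x'' = 2t, x(0) = x(1) = 0."""
    return t * (t * t - 1.0) / 3.0

def x_series(t, n_terms):
    """Partial sum of the eigenfunction expansion x = sum (alpha_n / lambda_n) e_n."""
    total = 0.0
    for n in range(1, n_terms + 1):
        alpha_n = 4.0 * (-1) ** (n + 1) / (n * math.pi)   # sine-series coefficient of 2t
        lambda_n = -(n * math.pi) ** 2                     # eigenvalue of d^2/dt^2
        total += alpha_n / lambda_n * math.sin(n * math.pi * t)
    return total

def peak_error(n_terms, samples=1001):
    """Largest |x_series - x_exact| over a uniform grid on [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(x_series(t, n_terms) - x_exact(t)) for t in grid)
```

Calling `peak_error(2)` gives the two-term error, and larger values of `n_terms` show how quickly the $1/n^3$ decay of the coefficients drives the error down.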

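Example 7.49

Solve $Ax = y$ using the eigenvector expansion technique when

$$ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad y = \begin{pmatrix} 3 \\ 4 \end{pmatrix}. \tag{7.538} $$

We already know from an earlier example, p. 285, that for $A$

$$ \lambda_1 = 1, \qquad e_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \tag{7.539} $$
$$ \lambda_2 = 3, \qquad e_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{7.540} $$

We want to express $y$ as

$$ y = \alpha_1 e_1 + \alpha_2 e_2. \tag{7.541} $$

Since the eigenvectors are orthogonal, we have from Eq. (7.186)

$$ \alpha_1 = \frac{\langle e_1, y \rangle}{\langle e_1, e_1 \rangle} = \frac{3 - 4}{1 + 1} = -\frac{1}{2}, \tag{7.542} $$
$$ \alpha_2 = \frac{\langle e_2, y \rangle}{\langle e_2, e_2 \rangle} = \frac{3 + 4}{1 + 1} = \frac{7}{2}, \tag{7.543} $$

so

$$ y = -\frac{1}{2} e_1 + \frac{7}{2} e_2. \tag{7.544} $$

Then

$$ x = \frac{\alpha_1}{\lambda_1} e_1 + \frac{\alpha_2}{\lambda_2} e_2, \tag{7.545} $$
$$ = -\frac{1}{2} \frac{1}{\lambda_1} e_1 + \frac{7}{2} \frac{1}{\lambda_2} e_2, \tag{7.546} $$
$$ = -\frac{1}{2} \frac{1}{1} e_1 + \frac{7}{2} \frac{1}{3} e_2, \tag{7.547} $$
$$ = -\frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} + \frac{7}{6} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \tag{7.548} $$
$$ = \begin{pmatrix} \frac{2}{3} \\ \frac{5}{3} \end{pmatrix}. \tag{7.549} $$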
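The eigenvector expansion of Example 7.49 can be written as a short routine. This is a sketch that assumes mutually orthogonal real eigenvectors and non-zero eigenvalues, which is the situation of that example; the function names are hypothetical.

```python
def dot(u, v):
    """Real inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def solve_by_eigen_expansion(eigenpairs, y):
    """Solve A x = y given (lambda_n, e_n) pairs of A.

    Assumes the e_n are mutually orthogonal and every lambda_n is non-zero.
    """
    x = [0.0] * len(y)
    for lam, e in eigenpairs:
        alpha = dot(e, y) / dot(e, e)   # coefficient of y on e_n, valid by orthogonality
        beta = alpha / lam              # coefficient of x on e_n
        x = [xi + beta * ei for xi, ei in zip(x, e)]
    return x

A = [[2.0, 1.0], [1.0, 2.0]]
pairs = [(1.0, [1.0, -1.0]), (3.0, [1.0, 1.0])]
y = [3.0, 4.0]
x = solve_by_eigen_expansion(pairs, y)
residual = [dot(row, x) - yi for row, yi in zip(A, y)]   # A x - y
```

The projection step relies on orthogonality; for a non-symmetric matrix the coefficients would instead have to come from a linear solve or from the reciprocal basis.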

Example 7.50 

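Solve $Ax = y$ using the eigenvector expansion technique when

$$ A = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}, \qquad y = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \qquad y = \begin{pmatrix} 3 \\ 6 \end{pmatrix}. \tag{7.550} $$

We first note that the two column space vectors,

$$ \begin{pmatrix} 2 \\ 4 \end{pmatrix}, \qquad \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \tag{7.551} $$

are linearly dependent. They span $\mathbb{R}^1$, but not $\mathbb{R}^2$.
It is easily shown that for $A$

$$ \lambda_1 = 4, \qquad e_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \tag{7.552} $$
$$ \lambda_2 = 0, \qquad e_2 = \begin{pmatrix} -1 \\ 2 \end{pmatrix}. \tag{7.553} $$

First consider $y = (3, 4)^T$. We want to express $y$ as

$$ y = \alpha_1 e_1 + \alpha_2 e_2. \tag{7.554} $$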
For this non-symmetric matrix, the eigenvectors are linearly independent, so they form a basis. However
they are not orthogonal, so there is not a direct way to compute $\alpha_1$ and $\alpha_2$. Matrix inversion shows
that $\alpha_1 = 5/2$ and $\alpha_2 = -1/2$, so

$$ y = \frac{5}{2} e_1 - \frac{1}{2} e_2. \tag{7.555} $$

Since the eigenvectors form a basis, $y$ can be represented with an eigenvector expansion. However no
solution for $x$ exists because $\lambda_2 = 0$ and $\alpha_2 \ne 0$, hence the coefficient $\beta_2 = \alpha_2 / \lambda_2$ does not exist.

However, for $y = (3, 6)^T$, we can say that

$$ y = 3 e_1 + 0 e_2. \tag{7.556} $$

We note that $(3, 6)^T$ is a scalar multiple of the so-called column space vector of $A$, $(2, 4)^T$. Consequently,

$$ x = \frac{\alpha_1}{\lambda_1} e_1 + \frac{\alpha_2}{\lambda_2} e_2, \tag{7.557} $$
$$ = \frac{3}{\lambda_1} e_1 + \frac{0}{0} e_2, \tag{7.558} $$
$$ = \frac{3}{4} e_1 + \gamma e_2, \tag{7.559} $$
$$ = \frac{3}{4} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \gamma \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \tag{7.560} $$
$$ = \begin{pmatrix} 3/4 - \gamma \\ 3/2 + 2 \gamma \end{pmatrix}, \tag{7.561} $$

where $\gamma$ is an arbitrary constant. Note that the vector $e_2 = (-1, 2)^T$ lies in the null space of $A$ since

$$ A e_2 = \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \tag{7.562} $$
$$ = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{7.563} $$

Since $e_2$ lies in the null space, any scalar multiple of $e_2$, say $\gamma e_2$, also lies in the null space. We can
conclude that for arbitrary $y$, the inverse does not exist. For vectors $y$ which lie in the column space of
$A$, the inverse exists, but it is not unique; arbitrary vectors from the null space of $A$ are admitted as
part of the solution.
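The two outcomes of Example 7.50, no solution or a one-parameter family of solutions, can be distinguished in code. This is a sketch for the $2 \times 2$ case with one zero eigenvalue, using Cramer's rule for the expansion coefficients because the eigenvectors are not orthogonal; the function names and the tolerance are assumptions.

```python
def expansion_coefficients(e1, e2, y):
    """Solve alpha_1 e1 + alpha_2 e2 = y by Cramer's rule (e1, e2 independent)."""
    det = e1[0] * e2[1] - e2[0] * e1[1]
    alpha_1 = (y[0] * e2[1] - e2[0] * y[1]) / det
    alpha_2 = (e1[0] * y[1] - y[0] * e1[1]) / det
    return alpha_1, alpha_2

def solve_singular(lam1, e1, e2, y, tol=1e-12):
    """Solve A x = y when e1 has eigenvalue lam1 != 0 and e2 spans the null space.

    Returns None if no solution exists; otherwise returns the particular
    solution with gamma = 0, to which any multiple of e2 may be added.
    """
    alpha_1, alpha_2 = expansion_coefficients(e1, e2, y)
    if abs(alpha_2) > tol:
        return None                      # beta_2 * 0 = alpha_2 cannot be satisfied
    return [alpha_1 / lam1 * c for c in e1]

A = [[2.0, 1.0], [4.0, 2.0]]
e1, e2 = [1.0, 2.0], [-1.0, 2.0]         # eigenvectors for lambda = 4 and lambda = 0
no_solution = solve_singular(4.0, e1, e2, [3.0, 4.0])
particular = solve_singular(4.0, e1, e2, [3.0, 6.0])
```

For $y = (3, 4)^T$ the routine returns `None`, while for $y = (3, 6)^T$ it returns $(3/4, 3/2)^T$, the $\gamma = 0$ member of the family in Eq. (7.561).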




7.6 Method of weighted residuals 

The method of weighted residuals is a quite general technique to solve equations. Two 
important methods which have widespread use in the engineering world, spectral methods 
and the even more pervasive finite element method, are special types of weighted residual 
methods. 

Consider the differential equation 

$$ Ly = f(t), \qquad t \in [a, b], \tag{7.564} $$

with homogeneous boundary conditions. Here, $L$ is a differential operator that is not necessarily
linear. We will work with functions and inner products in $L_2[a, b]$ space.
Approximate $y(t)$ by

$$ y(t) \approx y_p(t) = \sum_{n=1}^N a_n \phi_n(t), \tag{7.565} $$

where $\phi_n(t)$, $(n = 1, \ldots, N)$ are linearly independent functions (called trial functions) which
satisfy the boundary conditions. Forcing the trial functions to satisfy the boundary conditions,
in addition to having aesthetic appeal, makes it much more likely that if convergence
is obtained, the convergence will be to a solution which satisfies the differential equation
and boundary conditions. The trial functions can be orthogonal or non-orthogonal.$^{21}$ The
constants $a_n$, $(n = 1, \ldots, N)$ are to be determined. Substituting into the equation, we get a
residual

$$ r(t) = L y_p(t) - f(t). \tag{7.566} $$

Note that the residual $r(t)$ is not the error in the solution, $e(t)$, where

$$ e(t) = y(t) - y_p(t). \tag{7.567} $$

The residual will almost always be non-zero for $t \in [a, b]$. However, if $r(t) = 0$, then $e(t) = 0$.
We can choose $a_n$ such that the residual, computed in a weighted average over the domain, is
zero. To achieve this, we select now a set of linearly independent weighting functions $\psi_m(t)$,
$(m = 1, \ldots, N)$ and make them orthogonal to the residual. Thus,

$$ \langle \psi_m(t), r(t) \rangle = 0, \qquad m = 1, \ldots, N. \tag{7.568} $$

These are $N$ equations for the constants $a_n$.

There are several special ways in which the weight functions can be selected.

• Galerkin:$^{22}$ $\psi_i(t) = \phi_i(t)$.

• Collocation: $\psi_m(t) = \delta(t - t_m)$. Thus, $r(t_m) = 0$.

• Subdomain: $\psi_m(t) = 1$ for $t_{m-1} < t < t_m$ and zero everywhere else. Note that these
functions are orthogonal to each other. Also this method is easily shown to reduce to
the well known finite volume method.



21 It is occasionally advantageous, especially in the context of what is known as wavelet-based methods, to 
add extra functions which are linearly dependent into the set of trial functions. Such a basis is known as a 
frame. We will not consider these here; some background is given by Daubechies. 

22 Boris Grigorievich Galerkin, 1871-1945, Belarussian-born Russian-based engineer and mathematician, a
participant, witness, and victim of much political turbulence, did much of his early great work in the Czar's 
prisons, developed a finite element method in 1915, professor of structural mechanics at what was once, and 
is now again, St. Petersburg (at one time known as Petrograd, and later Leningrad). 
• Least squares: Minimize $\|r(t)\|$. This gives

$$ \frac{\partial \|r\|^2}{\partial a_m} = \frac{\partial}{\partial a_m} \int_a^b r^2\, dt, \tag{7.569} $$
$$ = 2 \int_a^b r \underbrace{\frac{\partial r}{\partial a_m}}_{=\psi_m(t)}\, dt. \tag{7.570} $$

So this method corresponds to $\psi_m = \partial r / \partial a_m$.

• Moments: $\psi_m(t) = t^{m-1}$, $m = 1, 2, \ldots$.

If the trial functions are orthogonal and the method is Galerkin, we will, following
Fletcher, who builds on the work of Finlayson, define the method to be a spectral method.
Other less restrictive definitions are in common usage in the present literature, and there is
no single consensus on what precisely constitutes a spectral method.$^{23}$
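The effect of the choice of weighting function can be seen on a one-term approximation. The sketch below is an illustration under stated assumptions: the model problem $d^2y/dt^2 = -1$, $y(0) = y(1) = 0$ (exact solution $y = t(1-t)/2$) and the trial function $\phi = \sin(\pi t)$ are chosen here for convenience and are not taken from the notes, and Simpson's rule stands in for the exact inner products.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def phi(t):
    """Trial function; satisfies phi(0) = phi(1) = 0."""
    return math.sin(math.pi * t)

def L_phi(t):
    """L applied to the trial function, here d^2 phi / dt^2."""
    return -math.pi ** 2 * math.sin(math.pi * t)

def f(t):
    """Right side of the assumed model problem L y = f."""
    return -1.0

def weighted_coefficient(psi):
    """Solve <psi, a L phi - f> = 0 for a; valid because L is linear here."""
    numerator = simpson(lambda t: psi(t) * f(t), 0.0, 1.0)
    denominator = simpson(lambda t: psi(t) * L_phi(t), 0.0, 1.0)
    return numerator / denominator

a_galerkin = weighted_coefficient(phi)              # psi = phi
a_least_squares = weighted_coefficient(L_phi)       # psi = dr/da = L phi
a_moment = weighted_coefficient(lambda t: 1.0)      # psi = t^0
a_collocation = f(0.5) / L_phi(0.5)                 # psi = delta(t - 1/2), so r(1/2) = 0
```

With one term, $y_p(1/2) = a$, so each coefficient can be compared with the exact midpoint value $1/8$. Galerkin and least squares coincide for this problem because $L\phi$ is proportional to $\phi$.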



Example 7.51

For $x \in L_2[0, 1]$, find a one-term approximate solution of the equation

$$ \frac{d^2 x}{dt^2} + x = t - 1, \tag{7.571} $$

with $x(0) = -1$, $x(1) = 1$.

It is easy to show that the exact solution is

$$ x(t) = -1 + t + \csc(1) \sin(t). \tag{7.572} $$



23 An important school in spectral methods, exemplified in the work of Gottlieb and Orszag, Canuto,
et al., and Fornberg, uses a looser nomenclature, which is not always precisely defined. In these works,
spectral methods are distinguished from finite difference methods and finite element methods in that spectral
methods employ basis functions which have global rather than local support; that is, spectral methods' basis
functions have non-zero values throughout the entire domain. While orthogonality of the basis functions
within a Galerkin framework is often employed, it is not demanded that this be the distinguishing feature
by those authors. Within this school, less emphasis is placed on the framework of the method of weighted
residuals, and the spectral method is divided into subclasses known as Galerkin, tau, and collocation. The
collocation method this school defines is identical to that defined here, and is also called by this school the
"pseudospectral" method. In nearly all understandings of the word "spectral," a convergence rate which is
more rapid than those exhibited by finite difference or finite element methods exists. In fact the accuracy of
a spectral method should grow exponentially with the number of nodes, as opposed
to that for a finite difference or finite element method, whose accuracy grows only with the number of nodes
raised to some power.

Another concern which arises with methods of this type is how many terms are necessary to properly
model the desired frequency level. For example, take our equation to be $d^2u/dt^2 = 1 + u^2$; $u(0) = u(\pi) = 0$,
and take $u = \sum_{n=1}^N a_n \sin(nt)$. If $N = 1$, we get $r(t) = -a_1 \sin t - 1 - a_1^2 \sin^2 t$. Expanding the square of the
sin term, we see the error has higher order frequency content: $r(t) = -a_1 \sin t - 1 - a_1^2 (1/2 - (1/2) \cos(2t))$.
The result is that if we want to get things right at a given level, we may have to reach outside that level.
How far outside we have to reach will be problem dependent.
Here we will see how well the method of weighted residuals can approximate this known solution. The
real value of the method is for problems in which exact solutions are not known.

Let $y = x - (2t - 1)$, so that $y(0) = y(1) = 0$. The transformed differential equation is

$$\frac{d^2 y}{dt^2} + y = -t. \tag{7.573}$$

Let us consider a one-term approximation,

$$y \approx y_p(t) = \alpha \phi(t). \tag{7.574}$$

There are many choices of basis functions $\phi(t)$. Let's try finite dimensional non-trivial polynomials
which match the boundary conditions. If we choose $\phi(t) = a$, a constant, we must take $a = 0$ to satisfy
the boundary conditions, so this does not work. If we choose $\phi(t) = a + bt$, we must take $a = 0$, $b = 0$
to satisfy both boundary conditions, so this also does not work. We can find a quadratic polynomial
which is non-trivial and satisfies both boundary conditions:

$$\phi(t) = t(1 - t). \tag{7.575}$$

Then

$$y_p(t) = \alpha t(1 - t). \tag{7.576}$$

We have to determine $\alpha$. Substituting into Eq. (7.566), the residual is found to be

$$r(t) = \mathbf{L} y_p - f(t) = \frac{d^2 y_p}{dt^2} + y_p - f(t), \tag{7.577}$$

$$= \underbrace{-2\alpha}_{d^2 y_p/dt^2} + \underbrace{\alpha t(1 - t)}_{y_p} - \underbrace{(-t)}_{f(t)} = t - \alpha(t^2 - t + 2). \tag{7.578}$$

Then, we choose $\alpha$ such that

$$\langle \psi(t), r(t) \rangle = \langle \psi(t), t - \alpha(t^2 - t + 2) \rangle = \int_0^1 \psi(t) \underbrace{\left(t - \alpha(t^2 - t + 2)\right)}_{=r(t)} dt = 0. \tag{7.579}$$

The form of the weighting function $\psi(t)$ is dictated by the particular method we choose:

1. Galerkin: $\psi(t) = \phi(t) = t(1 - t)$. The inner product gives $\frac{1}{12} - \frac{3}{10}\alpha = 0$, so that for non-trivial
solution, $\alpha = \frac{5}{18} = 0.278$.

$$y_p(t) = 0.278\, t(1 - t), \tag{7.580}$$

$$x_p(t) = 0.278\, t(1 - t) + 2t - 1. \tag{7.581}$$

2. Collocation: Choose $\psi(t) = \delta(t - \frac{1}{2})$, which gives $-\frac{7}{2}\alpha + 1 = 0$, from which $\alpha = \frac{2}{7} = 0.286$.

$$y_p(t) = 0.286\, t(1 - t), \tag{7.582}$$

$$x_p(t) = 0.286\, t(1 - t) + 2t - 1. \tag{7.583}$$

3. Subdomain: $\psi(t) = 1$, from which $-\frac{11}{6}\alpha + \frac{1}{2} = 0$, and $\alpha = \frac{3}{11} = 0.273$.

$$y_p(t) = 0.273\, t(1 - t), \tag{7.584}$$

$$x_p(t) = 0.273\, t(1 - t) + 2t - 1. \tag{7.585}$$
[Figure 7.14 panels: $x(t)$ versus $t$ (exact solution and one-term estimate); $e(t) = x_p(t) - x(t)$ versus $t$ (error in
Galerkin approximation); for $x'' + x = t - 1$; $x(0) = -1$, $x(1) = 1$.]

Figure 7.14: One-term estimate $x_p(t)$ and exact solution $x(t)$; Error in solution $x_p(t) - x(t)$.



4. Least squares: $\psi(t) = \partial r/\partial \alpha = -t^2 + t - 2$. Thus, $-\frac{11}{12} + \frac{101}{30}\alpha = 0$, from which $\alpha = \frac{55}{202} = 0.272$.

$$y_p(t) = 0.272\, t(1 - t), \tag{7.586}$$

$$x_p(t) = 0.272\, t(1 - t) + 2t - 1. \tag{7.587}$$

5. Moments: $\psi(t) = 1$ which, for this case, is the same as the subdomain method previously reported.

$$y_p(t) = 0.273\, t(1 - t), \tag{7.588}$$

$$x_p(t) = 0.273\, t(1 - t) + 2t - 1. \tag{7.589}$$



The approximate solution determined by the Galerkin method is overlaid against the exact solution in
Fig. 7.14. Also shown is the error in the approximation. The approximation is surprisingly accurate.
Note that the error, $e(t) = x_p(t) - x(t)$, is available because in this case we have the exact solution.
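
The one-term coefficients for the five weighting choices can be checked with exact rational arithmetic.
The following is a minimal sketch, not part of the original notes: it assumes the residual
$r(t) = t - \alpha(t^2 - t + 2)$ derived above, represents polynomials as coefficient lists (lowest power first),
and uses hypothetical helper names (`pmul`, `integ01`, `peval`, `alpha_for`).

```python
from fractions import Fraction as F

def pmul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integ01(p):
    """Integrate a polynomial exactly over [0, 1]."""
    return sum((F(c) / (k + 1) for k, c in enumerate(p)), F(0))

def peval(p, t):
    """Evaluate a polynomial at the point t."""
    return sum((F(c) * t**k for k, c in enumerate(p)), F(0))

# Residual split as r(t) = r0(t) + alpha * r1(t), with r0 = t, r1 = -(t^2 - t + 2).
r0 = [F(0), F(1)]
r1 = [F(-2), F(1), F(-1)]

def alpha_for(psi):
    """Solve <psi, r0> + alpha <psi, r1> = 0 for a polynomial weight psi."""
    return -integ01(pmul(psi, r0)) / integ01(pmul(psi, r1))

phi = [F(0), F(1), F(-1)]                 # basis function t(1 - t)
alphas = {
    "galerkin": alpha_for(phi),           # psi = phi
    "subdomain": alpha_for([F(1)]),       # psi = 1 on the whole domain
    "least_squares": alpha_for(r1),       # psi = dr/dalpha = r1
    "moments": alpha_for([F(1)]),         # psi = t^0
}
# Collocation: psi is a Dirac delta at t = 1/2, so the residual vanishes there.
s = F(1, 2)
alphas["collocation"] = -peval(r0, s) / peval(r1, s)

for name, value in alphas.items():
    print(f"{name:14s} alpha = {value} = {float(value):.4f}")
```

Working with `Fraction` rather than floating point keeps each coefficient in the same closed form as the
hand calculation, so any disagreement would be visible as a different fraction rather than a rounding difference.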




Some simplification can arise through use of integration by parts. This has the result of 
admitting basis functions which have less stringent requirements on the continuity of their 
derivatives. It is also a commonly used strategy in the finite element technique. 



Example 7.52

Consider a slight variant of the previous example problem, and employ integration by parts.

$$\frac{d^2 y}{dt^2} + y = f(t), \qquad y(0) = 0, \quad y(1) = 0. \tag{7.590}$$

Again, take a one-term expansion

$$y_p(t) = \alpha \phi(t). \tag{7.591}$$

At this point, we will only require $\phi(t)$ to satisfy the boundary conditions, and will specify it later. The
residual in the approximation is

$$r(t) = \frac{d^2 y_p}{dt^2} + y_p - f(t) = \alpha \frac{d^2 \phi}{dt^2} + \alpha \phi - f(t). \tag{7.592}$$
Now set a weighted residual to zero. We will also require the weighting function $\psi(t)$ to vanish at the
boundaries.

$$\langle \psi, r \rangle = \int_0^1 \psi(t) \left( \alpha \frac{d^2 \phi}{dt^2} + \alpha \phi(t) - f(t) \right) dt = 0. \tag{7.593}$$

Rearranging, we get

$$\alpha \int_0^1 \left( \psi(t) \frac{d^2 \phi}{dt^2} + \psi(t) \phi(t) \right) dt = \int_0^1 \psi(t) f(t) \, dt. \tag{7.594}$$

Now integrate by parts to get

$$\alpha \left( \left. \psi(t) \frac{d\phi}{dt} \right|_0^1 + \int_0^1 \left( \psi(t) \phi(t) - \frac{d\psi}{dt} \frac{d\phi}{dt} \right) dt \right) = \int_0^1 \psi(t) f(t) \, dt. \tag{7.595}$$

Since we have required $\psi(0) = \psi(1) = 0$, this simplifies to

$$\alpha \int_0^1 \left( \psi(t) \phi(t) - \frac{d\psi}{dt} \frac{d\phi}{dt} \right) dt = \int_0^1 \psi(t) f(t) \, dt. \tag{7.596}$$

So, the basis function $\phi$ only needs an integrable first derivative rather than an integrable second
derivative. As an aside, we note that the term on the left hand side bears resemblance (but differs by
a sign) to an inner product in the Sobolev space $W_2^1[0, 1]$ in which the Sobolev inner product $\langle \cdot, \cdot \rangle_s$
(an extension of the inner product for Hilbert space) is $\langle \psi(t), \phi(t) \rangle_s = \int_0^1 \left( \psi(t) \phi(t) + \frac{d\psi}{dt} \frac{d\phi}{dt} \right) dt$.

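Taking now, as before, $\phi = t(1 - t)$ and then choosing a Galerkin method so $\psi(t) = \phi(t) = t(1 - t)$,
and $f(t) = -t$, we get

$$\alpha \int_0^1 \left( t^2 (1 - t)^2 - (1 - 2t)^2 \right) dt = \int_0^1 t(1 - t)(-t) \, dt, \tag{7.597}$$

which gives

$$-\alpha \left( \frac{3}{10} \right) = -\frac{1}{12}, \tag{7.598}$$

so

$$\alpha = \frac{5}{18}, \tag{7.599}$$

as was found earlier. So

$$y_p = \frac{5}{18} t(1 - t), \tag{7.600}$$

with the Galerkin method.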
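
As a check on the integration by parts, the weak form of Eq. (7.596) and the strong form of Eq. (7.594) can
be evaluated side by side. This is a small sketch under the assumptions of this example ($\psi = \phi = t(1-t)$,
$f = -t$); the helper names `pmul`, `pder` and `integ01` are illustrative, not from the original notes.

```python
from fractions import Fraction as F

def pmul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pder(p):
    """Differentiate a polynomial once."""
    return [k * c for k, c in enumerate(p)][1:]

def integ01(p):
    """Integrate a polynomial exactly over [0, 1]."""
    return sum((F(c) / (k + 1) for k, c in enumerate(p)), F(0))

phi = [F(0), F(1), F(-1)]   # t(1 - t), vanishes at t = 0 and t = 1
psi = phi                   # Galerkin choice
f = [F(0), F(-1)]           # f(t) = -t

# Weak form: int(psi*phi - psi'*phi') dt, needs only first derivatives.
weak = integ01(pmul(psi, phi)) - integ01(pmul(pder(psi), pder(phi)))
# Strong form: int(psi*phi'' + psi*phi) dt, needs the second derivative of phi.
strong = integ01(pmul(psi, pder(pder(phi)))) + integ01(pmul(psi, phi))

rhs = integ01(pmul(psi, f))
alpha = rhs / weak
print(weak, strong, rhs, alpha)
```

Because $\psi$ vanishes at both boundaries, the two left-hand sides agree, and either one yields the same $\alpha$.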



Example 7.53

For $y \in L_2[0, 1]$, find a two-term spectral approximation (which by our definition of "spectral"
mandates a Galerkin formulation) to the solution of

$$\frac{d^2 y}{dt^2} + \sqrt{t}\, y = 1, \qquad y(0) = 0, \quad y(1) = 0. \tag{7.601}$$
Let's try polynomial basis functions. At a minimum, these basis functions must satisfy the boundary
conditions. Assumption of the first basis function to be a constant or linear gives rise to a trivial basis
function when the boundary conditions are enforced. The first non-trivial basis function is a quadratic:

$$\phi_1(t) = a_0 + a_1 t + a_2 t^2. \tag{7.602}$$

We need $\phi_1(0) = 0$ and $\phi_1(1) = 0$. The first condition gives $a_0 = 0$; the second gives $a_1 = -a_2$, so we
have $\phi_1 = a_1 (t - t^2)$. Since the magnitude of a basis function is arbitrary, $a_1$ can be set to unity to give

$$\phi_1(t) = t(1 - t). \tag{7.603}$$

Alternatively, we could have chosen the magnitude in such a fashion to guarantee an orthonormal basis
function, but that is a secondary concern for the purposes of this example.

We need a second linearly independent basis function for the two-term approximation. We try a
third order polynomial:

$$\phi_2(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3. \tag{7.604}$$

Enforcing the boundary conditions as before gives $b_0 = 0$ and $b_1 = -(b_2 + b_3)$, so

$$\phi_2(t) = -(b_2 + b_3) t + b_2 t^2 + b_3 t^3. \tag{7.605}$$

To achieve a spectral method (which in general is not necessary to achieve an approximate solution!),
we enforce $\langle \phi_1, \phi_2 \rangle = 0$:

$$\int_0^1 \underbrace{t(1 - t)}_{=\phi_1(t)} \underbrace{\left( -(b_2 + b_3) t + b_2 t^2 + b_3 t^3 \right)}_{=\phi_2(t)} dt = 0, \tag{7.606}$$

$$-\frac{b_2}{30} - \frac{b_3}{20} = 0, \tag{7.607}$$

$$b_2 = -\frac{3}{2} b_3. \tag{7.608}$$

Substituting and factoring gives

$$\phi_2(t) = -\frac{b_3}{2} t(1 - t)(2t - 1). \tag{7.609}$$

Again, because $\phi_2$ is a basis function, the lead constant is arbitrary; we take for convenience $b_3 = -2$ to
give

$$\phi_2 = t(1 - t)(2t - 1). \tag{7.610}$$

Again, $b_3$ could alternatively have been chosen to yield an orthonormal basis function.
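Now we want to choose $\alpha_1$ and $\alpha_2$ so that our approximate solution

$$y_p(t) = \alpha_1 \phi_1(t) + \alpha_2 \phi_2(t), \tag{7.611}$$

has a zero weighted residual. With

$$\mathbf{L} = \left( \frac{d^2}{dt^2} + \sqrt{t} \right), \tag{7.612}$$

we have the residual as

$$r(t) = \mathbf{L} y_p(t) - f(t) = \mathbf{L} \left( \alpha_1 \phi_1(t) + \alpha_2 \phi_2(t) \right) - 1 = \alpha_1 \mathbf{L} \phi_1(t) + \alpha_2 \mathbf{L} \phi_2(t) - 1. \tag{7.613}$$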
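To drive the weighted residual to zero, take

$$\langle \psi_1, r \rangle = \alpha_1 \langle \psi_1, \mathbf{L} \phi_1 \rangle + \alpha_2 \langle \psi_1, \mathbf{L} \phi_2 \rangle - \langle \psi_1, 1 \rangle = 0, \tag{7.614}$$

$$\langle \psi_2, r \rangle = \alpha_1 \langle \psi_2, \mathbf{L} \phi_1 \rangle + \alpha_2 \langle \psi_2, \mathbf{L} \phi_2 \rangle - \langle \psi_2, 1 \rangle = 0. \tag{7.615}$$

This is easily cast in matrix form as a linear system of equations for the unknowns $\alpha_1$ and $\alpha_2$:

$$\begin{pmatrix} \langle \psi_1, \mathbf{L} \phi_1 \rangle & \langle \psi_1, \mathbf{L} \phi_2 \rangle \\ \langle \psi_2, \mathbf{L} \phi_1 \rangle & \langle \psi_2, \mathbf{L} \phi_2 \rangle \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} \langle \psi_1, 1 \rangle \\ \langle \psi_2, 1 \rangle \end{pmatrix}. \tag{7.616}$$

We choose the Galerkin method, and thus set $\psi_1 = \phi_1$ and $\psi_2 = \phi_2$, so

$$\begin{pmatrix} \langle \phi_1, \mathbf{L} \phi_1 \rangle & \langle \phi_1, \mathbf{L} \phi_2 \rangle \\ \langle \phi_2, \mathbf{L} \phi_1 \rangle & \langle \phi_2, \mathbf{L} \phi_2 \rangle \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} \langle \phi_1, 1 \rangle \\ \langle \phi_2, 1 \rangle \end{pmatrix}. \tag{7.617}$$

Each of the inner products represents a definite integral which is easily evaluated via computer algebra.
For example,

$$\langle \phi_1, \mathbf{L} \phi_1 \rangle = \int_0^1 \underbrace{t(1 - t)}_{\phi_1} \underbrace{\left( -2 + (1 - t) t^{3/2} \right)}_{\mathbf{L} \phi_1} dt = -\frac{215}{693}. \tag{7.618}$$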

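When each inner product is evaluated, the following system results

$$\begin{pmatrix} -\frac{215}{693} & \frac{16}{9009} \\ \frac{16}{9009} & -\frac{197}{1001} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} \frac{1}{6} \\ 0 \end{pmatrix}. \tag{7.619}$$

Inverting the system, it is found that

$$\alpha_1 = -\frac{760617}{1415794} = -0.537, \qquad \alpha_2 = -\frac{3432}{707897} = -0.00485. \tag{7.620}$$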

Thus, the estimate for the solution is

$$y_p(t) = -0.537\, t(1 - t) - 0.00485\, t(1 - t)(2t - 1). \tag{7.621}$$

The two-term approximate solution determined is overlaid against a more accurate solution obtained
by numerical integration of the full equation in Fig. 7.15. Also shown is the error in the approximation.
The two-term solution is surprisingly accurate.

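By normalizing the basis functions, we can find an orthonormal expansion. One finds that

$$||\phi_1||_2 = \sqrt{\int_0^1 \phi_1^2 \, dt}, \tag{7.622}$$

$$= \sqrt{\int_0^1 t^2 (1 - t)^2 \, dt}, \tag{7.623}$$

$$= \frac{1}{\sqrt{30}}, \tag{7.624}$$

$$||\phi_2||_2 = \sqrt{\int_0^1 \phi_2^2 \, dt}, \tag{7.625}$$

$$= \sqrt{\int_0^1 t^2 (1 - t)^2 (2t - 1)^2 \, dt}, \tag{7.626}$$

$$= \frac{1}{\sqrt{210}}. \tag{7.627}$$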
[Figure 7.15 panels: numerical (~exact) solution overlaid on two-term spectral (Galerkin) approximation; difference
between numerical (~exact) and two-term spectral (Galerkin) approximation, $y_p(t) - y(t)$.]

Figure 7.15: Two-term spectral (Galerkin) estimate $y_p(t)$ and highly accurate numerical
solution $y(t)$; Error in approximation $y_p(t) - y(t)$.



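The approximate solution can then be rewritten as an orthonormal expansion:

$$y_p(t) = -\frac{760617}{1415794 \sqrt{30}} \left( \sqrt{30}\, t(1 - t) \right) - \frac{3432}{707897 \sqrt{210}} \left( \sqrt{210}\, t(1 - t)(2t - 1) \right), \tag{7.628}$$

$$= -0.0981 \underbrace{\left( \sqrt{30}\, t(1 - t) \right)}_{\varphi_1} - 0.000335 \underbrace{\left( \sqrt{210}\, t(1 - t)(2t - 1) \right)}_{\varphi_2}. \tag{7.629}$$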



Because the trial functions have been normalized, one can directly compare the coefficients' magnitude. 
It is seen that the bulk of the solution is captured by the first term. 
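
The inner products of this example involve $\sqrt{t}$, but they can still be evaluated exactly if each function is
stored as a sum of terms $c\, t^e$ with rational (possibly half-integer) exponents. The sketch below is an
illustration under that assumption; the dictionary representation and the names `mul`, `add`, `d2`,
`integ01`, `apply_L` and `inner` are hypothetical helpers, not taken from the original notes.

```python
from fractions import Fraction as F

# A function is a dict {exponent: coefficient} meaning sum of c * t**e.

def mul(p, q):
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = out.get(ep + eq, F(0)) + cp * cq
    return out

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, F(0)) + c
    return out

def d2(p):
    """Second derivative, term by term."""
    return {e - 2: c * e * (e - 1) for e, c in p.items() if e * (e - 1) != 0}

def integ01(p):
    """Exact integral over [0, 1]; every exponent here exceeds -1."""
    return sum((c / (e + 1) for e, c in p.items()), F(0))

def apply_L(p):
    """L = d^2/dt^2 + sqrt(t)."""
    return add(d2(p), mul({F(1, 2): F(1)}, p))

def inner(p, q):
    return integ01(mul(p, q))

phi1 = {F(1): F(1), F(2): F(-1)}                # t(1 - t)
phi2 = {F(1): F(-1), F(2): F(3), F(3): F(-2)}   # t(1 - t)(2t - 1)
one = {F(0): F(1)}
basis = [phi1, phi2]

# Galerkin system A alpha = b with A_ij = <phi_i, L phi_j>, b_i = <phi_i, 1>.
A = [[inner(pi, apply_L(pj)) for pj in basis] for pi in basis]
b = [inner(pi, one) for pi in basis]

# Solve the 2x2 system by Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
alpha1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
alpha2 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
print(A, b)
print(alpha1, float(alpha1), alpha2, float(alpha2))
```

The same dictionary representation extends to more basis functions; only the final linear solve would need
to be replaced by a general elimination routine.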



Example 7.54

For the equation of the previous example,

$$\frac{d^2 y}{dt^2} + \sqrt{t}\, y = 1, \qquad y(0) = 0, \quad y(1) = 0, \tag{7.630}$$

examine the convergence rates for a collocation method as the number of modes becomes large.

Let us consider a set of trial functions which do not happen to be orthogonal, but are, of course,
linearly independent. Take

$$\phi_n(t) = t^n (t - 1), \qquad n = 1, \ldots, N. \tag{7.631}$$

So we seek to find a vector $\boldsymbol{\alpha} = \alpha_n$, $n = 1, \ldots, N$, such that for a given number of collocation points $N$
the approximation

$$y_N(t) = \alpha_1 \phi_1(t) + \ldots + \alpha_n \phi_n(t) + \ldots + \alpha_N \phi_N(t), \tag{7.632}$$

drives a weighted residual to zero. Obviously each of these trial functions satisfies both boundary
conditions, and they have the advantage of being easy to program for an arbitrary number of modes, as
no Gram-Schmidt orthogonalization process is necessary. The details of the analysis are similar to
those of the previous example, except we perform it many times, varying the number of nodes in each
calculation. For the collocation method, we take the weighting functions to be

$$\psi_n(t) = \delta(t - t_n), \qquad n = 1, \ldots, N. \tag{7.633}$$
Figure 7.16: Error in solution $y_N(t) - y_{N_{max}}(t)$ as a function of number of collocation points
$N$ demonstrating exponential convergence for the spectral-type collocation method.



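Here we choose $t_n = n/(N + 1)$, $n = 1, \ldots, N$, so that the collocation points are evenly distributed in
$t \in [0, 1]$. We then form the matrix

$$\mathbf{A} = \begin{pmatrix} \langle \psi_1, \mathbf{L} \phi_1 \rangle & \langle \psi_1, \mathbf{L} \phi_2 \rangle & \ldots & \langle \psi_1, \mathbf{L} \phi_N \rangle \\ \langle \psi_2, \mathbf{L} \phi_1 \rangle & \langle \psi_2, \mathbf{L} \phi_2 \rangle & \ldots & \langle \psi_2, \mathbf{L} \phi_N \rangle \\ \vdots & \vdots & & \vdots \\ \langle \psi_N, \mathbf{L} \phi_1 \rangle & \langle \psi_N, \mathbf{L} \phi_2 \rangle & \ldots & \langle \psi_N, \mathbf{L} \phi_N \rangle \end{pmatrix}, \tag{7.634}$$

and the vector

$$\mathbf{b} = \begin{pmatrix} \langle \psi_1, 1 \rangle \\ \vdots \\ \langle \psi_N, 1 \rangle \end{pmatrix}, \tag{7.635}$$

and then solve for $\boldsymbol{\alpha}$ in

$$\mathbf{A} \cdot \boldsymbol{\alpha} = \mathbf{b}. \tag{7.636}$$

We then perform this calculation for $N = 1, \ldots, N_{max}$. We consider $N = N_{max}$ to give the most exact
solution and calculate an error by finding the norm of the difference of the solution for $N < N_{max}$ and
that at $N = N_{max}$:

$$e_N = ||y_N(t) - y_{N_{max}}(t)||_2 = \sqrt{\int_0^1 \left( y_N(t) - y_{N_{max}}(t) \right)^2 dt}. \tag{7.637}$$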


The error $e_N$ is plotted as a function of $N$ in Fig. 7.16. We notice even on a logarithmic
plot that the error reduction is accelerating as the number of nodes $N$ increases. If the slope had
relaxed to a constant, then the convergence would be a power law convergence, which is characteristic
of finite difference and finite element methods. For this example of the method of weighted residuals,
we see that the rate of convergence increases as the number of nodes increases, which is characteristic
of exponential convergence. For exponential convergence, we have $e_N \sim \exp(-aN)$, where $a$ is some
positive constant; for power law convergence, we have $e_N \sim N^{-\beta}$ where $\beta$ is some positive constant.
At the highest value of $N$, $N = N_{max} = 10$, we have a local convergence rate of $O(N^{-21.9})$ which is
remarkably fast. In comparison, a second order finite difference technique will converge at a rate of
$O(N^{-2})$. In general and if possible one would choose a method with the fastest convergence rate, all
else being equal.
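
The collocation study can be reproduced in a few lines. The following is a rough sketch, not the authors'
program: with Dirac delta weights, $\langle \psi_m, \mathbf{L}\phi_n \rangle$ reduces to $(\mathbf{L}\phi_n)(t_m)$ and $\langle \psi_m, 1 \rangle = 1$. The
helper names, the plain Gaussian elimination, the Simpson quadrature with 400 intervals, and the smaller
reference value `N_MAX = 8` are all assumptions made for illustration.

```python
import math

def L_phi(n, t):
    """Apply L = d^2/dt^2 + sqrt(t) to phi_n(t) = t^n (t - 1) at a point t > 0."""
    d2 = (n + 1) * n * t ** (n - 1)
    if n >= 2:
        d2 -= n * (n - 1) * t ** (n - 2)
    return d2 + math.sqrt(t) * t ** n * (t - 1)

def solve(A, b):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def collocate(N):
    """Coefficients alpha_1..alpha_N for N evenly spaced collocation points."""
    pts = [n / (N + 1) for n in range(1, N + 1)]
    A = [[L_phi(n, t) for n in range(1, N + 1)] for t in pts]
    return solve(A, [1.0] * N)

def y(alpha, t):
    return sum(a * t ** n * (t - 1) for n, a in enumerate(alpha, start=1))

def l2_diff(a1, a2, m=400):
    """Composite Simpson estimate of the L2[0, 1] norm of y_a1 - y_a2."""
    h = 1.0 / m
    s = 0.0
    for i in range(m + 1):
        w = 1 if i in (0, m) else (4 if i % 2 else 2)
        s += w * (y(a1, i * h) - y(a2, i * h)) ** 2
    return math.sqrt(s * h / 3)

N_MAX = 8
ref = collocate(N_MAX)
errors = {N: l2_diff(collocate(N), ref) for N in range(1, N_MAX)}
for N, e in errors.items():
    print(N, e)
```

Plotting `errors` on logarithmic axes would give a picture in the spirit of Fig. 7.16, though the values will
differ from the figure because the reference solution and quadrature here are not those of the original.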




7.7 Uncertainty quantification via polynomial chaos 

The methods of this chapter can be applied to account for how potential uncertainties present
in model parameters affect the solutions of differential equations. To study this, we will
introduce a stochastic nature into our parameters. There are many ways to deal with these
so-called stochastic differential equations. One important method is known variously as
"polynomial chaos," "Wiener[24]-Askey[25] chaos," as well as other names. The term "chaos"
in this context was introduced by Wiener; it is in no way connected to the more modern
interpretation of chaos from non-linear dynamics, as will be considered in Sec. 9.11.3.
Polynomial chaos is relevant, for example, to a differential equation of the form

$$\frac{dy}{dt} = f(y; k), \qquad y(0) = y_0, \tag{7.638}$$

where k is a parameter. For an individual calculation, k is a fixed constant. But because k 
is taken to possess an intrinsic uncertainty, it is allowed to take on a slightly different value 
for the next calculation. We expect a solution of the form y = y(t; k); that is, the effect of 
the parameter will be realized in the solution. One way to handle the uncertainty in k is to 
examine a large number of solutions, each for a different value of k. The values chosen for k 
are driven by its uncertainty distribution, assumed to be known. We thus see how the uncertainty in
k is manifested in the solution y. This is known as the Monte Carlo method; it is an effective
strategy, although potentially expensive. 

For many problems, we can more easily quantify the uncertainty of the output y by 
propagating the known uncertainty of k via polynomial chaos, which has as its foundation 
notions from linear analysis. The method has the advantage of being a fully deterministic 
way to account for stochastic effects in differential equations. There are many variants on this 
method; we shall focus only on one canonical linear example which illustrates key aspects of 
the technique for ordinary differential equations. The method can be extended to algebraic 
and partial differential equations, both for scalar equations as well as for systems. 



Example 7.55

Given that

$$\frac{dy}{dt} = -ky, \qquad y(0) = 1, \tag{7.639}$$



24 Norbert Wiener, 1894-1964, American mathematician. 
25 Richard Askey, 1933-, American mathematician. 
and that $k$ has an associated uncertainty, such that

$$k = \mu + \sigma \xi, \tag{7.640}$$

where $\mu$ and $\sigma$ are known constants, and $\xi \in (-\infty, \infty)$ is a random variable with a Gaussian distribution
about a mean of zero with standard deviation of unity, find a two-term estimate of the behavior of $y(t)$
which accounts for the variation in $k$.

For our $k = \mu + \sigma \xi$, the mean value of $k$ can be easily shown to be $\mu$, and the standard deviation
of $k$ is $\sigma$. The solution to Eq. (7.639) is

$$y = e^{-kt} = e^{-(\mu + \sigma \xi) t}, \tag{7.641}$$

and will have different values, depending on the value $k$ possesses for that calculation. If there is no
uncertainty in $k$, i.e. $\sigma = 0$, the solution to Eq. (7.639) is obviously

$$y = e^{-\mu t}. \tag{7.642}$$

Let us now try to account for the uncertainty in $k$ in predicting the behavior of $y$ when $\sigma \ne 0$. Let
us imagine that $k$ has an $N + 1$-term Fourier expansion of

$$k(\xi) = \sum_{n=0}^{N} \alpha_n \phi_n(\xi), \tag{7.643}$$

where $\phi_n(\xi)$ are a known set of basis functions. Now as the random input $\xi$ is varied, $k$ will vary. And
we expect the output $y(t)$ to vary, so we can imagine that we really seek $y(t, k) = y(t, k(\xi))$. Dispensing
with $k$ in favor of $\xi$, we can actually seek $y(t, \xi)$. Let us assume that $y(t, \xi)$ has a similar Fourier
expansion,

$$y(t, \xi) = \sum_{n=0}^{N} y_n(t) \phi_n(\xi), \tag{7.644}$$

where we have also employed a separation of variables technique, with $\phi_n(\xi)$, $n = 0, \ldots, N$ as a set of
basis functions and $y_n(t)$ as the time-dependent amplitude of each basis function. Let us choose the
basis functions to be orthogonal:

$$\langle \phi_n(\xi), \phi_m(\xi) \rangle = 0, \qquad n \ne m. \tag{7.645}$$

Since the domain of $\xi$ is doubly infinite, a good choice for the basis functions is the Hermite polynomials;
following standard practice, we choose the probabilists' form, $\phi_n(\xi) = He_n(\xi)$, Sec. 5.1.4.2, recalling
$He_0(\xi) = 1$, $He_1(\xi) = \xi$, $He_2(\xi) = -1 + \xi^2$, $\ldots$. Note that other non-Gaussian distributions of
parametric uncertainty can render other basis functions to be better choices.
When we equip the inner product with the weighting function

$$w(\xi) = \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2}, \tag{7.646}$$

we find our chosen basis functions are orthogonal:

$$\langle \phi_n(\xi), \phi_m(\xi) \rangle = \int_{-\infty}^{\infty} \phi_n(\xi) \phi_m(\xi) w(\xi) \, d\xi, \tag{7.647}$$

$$= \int_{-\infty}^{\infty} He_n(\xi) He_m(\xi) \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi, \tag{7.648}$$

$$= n! \, \delta_{nm}. \tag{7.649}$$
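Let us first find the coefficients $\alpha_n$ in the Fourier-Hermite expansion of $k(\xi)$:

$$k(\xi) = \sum_{n=0}^{N} \alpha_n \phi_n(\xi), \tag{7.650}$$

$$\langle \phi_m(\xi), k(\xi) \rangle = \left\langle \phi_m(\xi), \sum_{n=0}^{N} \alpha_n \phi_n(\xi) \right\rangle, \tag{7.651}$$

$$= \sum_{n=0}^{N} \alpha_n \langle \phi_m(\xi), \phi_n(\xi) \rangle, \tag{7.652}$$

$$= \sum_{n=0}^{N} \alpha_n n! \, \delta_{mn}, \tag{7.653}$$

$$= m! \, \alpha_m, \tag{7.654}$$

$$\alpha_n = \frac{\langle \phi_n(\xi), k(\xi) \rangle}{n!}, \tag{7.655}$$

$$= \frac{1}{n!} \int_{-\infty}^{\infty} He_n(\xi) (\mu + \sigma \xi) w(\xi) \, d\xi, \tag{7.656}$$

$$= \frac{1}{\sqrt{2\pi}\, n!} \int_{-\infty}^{\infty} He_n(\xi) (\mu + \sigma \xi) e^{-\xi^2/2} \, d\xi. \tag{7.657}$$

Because of the polynomial nature of $k = \mu + \sigma \xi$ and the orthogonality of the polynomial functions,
there are only two non-zero terms in the expansion: $\alpha_0 = \mu$ and $\alpha_1 = \sigma$; thus,

$$k(\xi) = \mu + \sigma \xi = \alpha_0 He_0(\xi) + \alpha_1 He_1(\xi) = \mu He_0(\xi) + \sigma He_1(\xi). \tag{7.658}$$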



So for this simple distribution of $k(\xi)$, the infinite Fourier series expansion of Eq. (7.643) is a finite
two-term expansion. We actually could have seen this by inspection, but it was useful to go through
the formal exercise.

Now, substitute the expansions of Eqs. (7.643, 7.644) into the governing Eq. (7.639):

$$\frac{d}{dt} \underbrace{\left( \sum_{n=0}^{N} y_n(t) \phi_n(\xi) \right)}_{y} = - \underbrace{\left( \sum_{n=0}^{N} \alpha_n \phi_n(\xi) \right)}_{k} \underbrace{\left( \sum_{m=0}^{N} y_m(t) \phi_m(\xi) \right)}_{y}. \tag{7.659}$$

Equation (7.659) forms $N + 1$ ordinary differential equations, still with an explicit dependency on $\xi$.
We need an initial condition for each of them. The initial condition $y(0) = 1$ can be recast as

$$y(0, \xi) = 1 = \sum_{n=0}^{N} y_n(0) \phi_n(\xi). \tag{7.660}$$

Now we could go through the same formal exercise as for $k$ to determine the Fourier expansion of
$y(0) = 1$. But since $\phi_0(\xi) = 1$, we see by inspection the set of $N + 1$ initial conditions are

$$y_0(0) = 1, \quad y_1(0) = 0, \quad y_2(0) = 0, \quad \ldots, \quad y_N(0) = 0. \tag{7.661}$$

Let us now rearrange Eq. (7.659) to get

$$\sum_{n=0}^{N} \frac{dy_n}{dt} \phi_n(\xi) = - \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \phi_n(\xi) \phi_m(\xi). \tag{7.662}$$

n=0 
We still need to remove the explicit dependency on the random variable $\xi$. To achieve this, we will
take the inner product of Eq. (7.662) with a set of functions, so as to simplify the system into a cleaner
system of ordinary differential equations. Let us choose to invoke a Galerkin procedure by taking the
inner product of Eq. (7.662) with $\phi_l(\xi)$:

$$\left\langle \phi_l(\xi), \sum_{n=0}^{N} \frac{dy_n}{dt} \phi_n(\xi) \right\rangle = - \left\langle \phi_l(\xi), \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \phi_n(\xi) \phi_m(\xi) \right\rangle, \tag{7.663}$$

$$\sum_{n=0}^{N} \frac{dy_n}{dt} \langle \phi_l(\xi), \phi_n(\xi) \rangle = - \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \langle \phi_l(\xi), \phi_n(\xi) \phi_m(\xi) \rangle, \tag{7.664}$$

$$\frac{dy_l}{dt} \langle \phi_l(\xi), \phi_l(\xi) \rangle = - \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \langle \phi_l(\xi), \phi_n(\xi) \phi_m(\xi) \rangle, \tag{7.665}$$

$$\frac{dy_l}{dt} = - \frac{1}{\langle \phi_l(\xi), \phi_l(\xi) \rangle} \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \langle \phi_l(\xi), \phi_n(\xi) \phi_m(\xi) \rangle, \tag{7.666}$$

$$\frac{dy_l}{dt} = - \frac{1}{l!} \sum_{n=0}^{N} \sum_{m=0}^{N} \alpha_n y_m(t) \langle \phi_l(\xi), \phi_n(\xi) \phi_m(\xi) \rangle, \qquad l = 0, \ldots, N. \tag{7.667}$$

Equation (7.667) forms $N + 1$ ordinary differential equations, with $N + 1$ initial conditions provided by
Eq. (7.661). All dependency on $\xi$ is removed by explicit evaluation of the inner products for all $l$, $n$,
and $m$. We could have arrived at an analogous system of ordinary differential equations had we chosen
any of the other standard set of functions for the inner product. For example, Dirac delta functions
would have led to a collocation method. Note the full expression of the unusual inner product which
appears in Eq. (7.667) is

$$\langle \phi_l(\xi), \phi_n(\xi) \phi_m(\xi) \rangle = \int_{-\infty}^{\infty} \phi_l(\xi) \phi_n(\xi) \phi_m(\xi) \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi. \tag{7.668}$$

This relation can be reduced further, but it is not straightforward.
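
For polynomial basis functions, the triple inner product above can be evaluated exactly from the moments of
the standard Gaussian, $E[\xi^k] = (k-1)!!$ for even $k$ and zero for odd $k$. The sketch below uses that fact to
assemble the coefficient matrix $\mathbf{M}$ of the Galerkin system $d\mathbf{y}/dt = -\mathbf{M} \cdot \mathbf{y}$ implied by Eq. (7.667). It is
an illustration only; the function names and the list-of-coefficients representation are assumptions.

```python
from fractions import Fraction as F
from math import factorial

def pmul(p, q):
    """Multiply two polynomials in xi (coefficient lists, lowest power first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def gauss_mean(p):
    """E[p(xi)] for xi ~ N(0, 1): E[xi^k] = (k-1)!! for even k, 0 for odd k."""
    total = F(0)
    for k, c in enumerate(p):
        if k % 2 == 0:
            dfact = 1
            for j in range(k - 1, 0, -2):
                dfact *= j
            total += c * dfact
    return total

def hermite(n_max):
    """Probabilists' Hermite polynomials via He_{n+1} = xi He_n - n He_{n-1}."""
    He = [[F(1)], [F(0), F(1)]]
    for n in range(1, n_max):
        nxt = [F(0)] + He[n]                 # xi * He_n
        for i, c in enumerate(He[n - 1]):
            nxt[i] -= n * c                  # - n * He_{n-1}
        He.append(nxt)
    return He[: n_max + 1]

def galerkin_matrix(alpha):
    """M[l][m] = (1/l!) sum_n alpha_n <He_l, He_n He_m>, so that dy/dt = -M y."""
    N = len(alpha) - 1
    He = hermite(N)
    M = []
    for l in range(N + 1):
        row = []
        for m in range(N + 1):
            s = sum((alpha[n] * gauss_mean(pmul(He[l], pmul(He[n], He[m])))
                     for n in range(N + 1)), F(0))
            row.append(s / factorial(l))
        M.append(row)
    return M

mu, sigma = F(1), F(1, 3)
M2 = galerkin_matrix([mu, sigma])            # two-term expansion, N = 1
M3 = galerkin_matrix([mu, sigma, F(0)])      # three-term expansion, N = 2
print(M2)
print(M3)
```

For $N = 1$ the matrix has $\mu$ on the diagonal and $\sigma$ off the diagonal, which is the two-equation system examined
next; raising $N$ only requires padding the coefficient list with zeros.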

When $N = 1$, we have a two-term series, with $l = 0, 1$. Detailed evaluation of all inner products
yields two ordinary differential equations:

$$\frac{dy_0}{dt} = -\mu y_0 - \sigma y_1, \qquad y_0(0) = 1, \tag{7.669}$$

$$\frac{dy_1}{dt} = -\sigma y_0 - \mu y_1, \qquad y_1(0) = 0. \tag{7.670}$$

Note when $\sigma = 0$, $y_0(t) = e^{-\mu t}$, $y_1(t) = 0$, and we recover our original non-stochastic result. For
$\sigma \ne 0$, this linear system can be solved exactly using methods of the upcoming Section 9.5.1. Direct
substitution reveals that the solution is in fact

$$y_0(t) = e^{-\mu t} \cosh(\sigma t), \tag{7.671}$$

$$y_1(t) = -e^{-\mu t} \sinh(\sigma t). \tag{7.672}$$

Thus, the two-term approximation is

$$y(t, \xi) \approx y_0(t) \phi_0(\xi) + y_1(t) \phi_1(\xi), \tag{7.673}$$

$$= e^{-\mu t} \left( \cosh(\sigma t) - \sinh(\sigma t)\, \xi \right). \tag{7.674}$$

The non-stochastic solution $e^{-\mu t}$ is obviously modulated by the uncertainty. Even when $\xi = 0$, there
is a weak modulation by $\cosh(\sigma t) \sim 1 + \sigma^2 t^2/2 + \sigma^4 t^4/24 + \ldots$.
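Standard probability theory lets us estimate the mean value of $y(t, \xi)$, which we call $\bar{y}(t)$, over a
range of normally distributed values of $\xi$:

$$\bar{y}(t) = \int_{-\infty}^{\infty} y(t, \xi) \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi, \tag{7.675}$$

$$= \int_{-\infty}^{\infty} e^{-\mu t} \left( \cosh(\sigma t) - \sinh(\sigma t)\, \xi \right) \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi, \tag{7.676}$$

$$= e^{-\mu t} \cosh(\sigma t). \tag{7.677}$$

Thus, the mean value of $y$ is $y_0(t) = e^{-\mu t} \cosh(\sigma t)$. The standard deviation of the solution, $\sigma_s(t)$, is
found by a similar process

$$\sigma_s(t) = \sqrt{\int_{-\infty}^{\infty} \left( y(t, \xi) - \bar{y}(t) \right)^2 \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi}, \tag{7.678}$$

$$= \sqrt{\int_{-\infty}^{\infty} \left( -e^{-\mu t} \sinh(\sigma t)\, \xi \right)^2 \frac{1}{\sqrt{2\pi}} e^{-\xi^2/2} \, d\xi}, \tag{7.679}$$

$$= e^{-\mu t} \sinh(\sigma t). \tag{7.680}$$

Note that $\sigma_s(t) = |y_1(t)|$. Also note that $\sigma_s$ is distinct from $\sigma$, the standard deviation of the parameter
$k$.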

All of this is easily verified by direct calculation. If we take $\mu = 1$ and $\sigma = 1/3$, we have $k = 1 + \xi/3$,
recalling that $\xi$ is a random number, with a Gaussian distribution with unity standard deviation about
zero. Let us examine various predictions at $t = 1$. Ignoring all stochastic effects, we might naively
predict that the expected value of $y$ should be

$$y(t = 1) = e^{-\mu t} = e^{-(1)(1)} = 0.367879, \tag{7.681}$$

with no standard deviation. However, if we execute so-called Monte Carlo simulations where $k$ is varied
through its range, calculate $y$ at $t = 1$ for each realization of $k$, and then take the mean value of all
predictions, we find for $10^6$ simulations that the mean value is

$$y_{Monte\ Carlo}(t = 1) = 0.388856. \tag{7.682}$$

This number will slightly change if a different set of random values of $k$ are tested. Remarkably though,
$y_{Monte\ Carlo}$ is well predicted by our polynomial chaos estimate of

$$y_0(t = 1) = e^{-\mu t} \cosh(\sigma t) = e^{-(1)(1)} \cosh\left( \frac{1}{3} \right) = 0.388507. \tag{7.683}$$

We could further improve the Monte Carlo estimate by taking more samples. We could further improve
the polynomial chaos estimate by including more terms in the expansion. As the number of Monte
Carlo estimates and the number of terms in the polynomial chaos expansion approached infinity, the two
estimates would converge. And they would converge to a number different than that of the naive
estimate. The exponential function warped the effect of the Gaussian distributed $k$ such that the
realization of $y$ at $t = 1$ was distorted to a greater value.

For the same $10^6$ simulations, the Monte Carlo method predicts a standard deviation of $y$ at $t = 1$
of

$$\sigma_{s,\,Monte\ Carlo} = 0.133325. \tag{7.684}$$

This number is well estimated by the magnitude, $|y_1(t = 1)|$:

$$|y_1(t = 1)| = e^{-\mu t} \sinh(\sigma t) = e^{-(1)(1)} \sinh\left( \frac{1}{3} \right) = 0.124910. \tag{7.685}$$
Again, both estimates could be improved by more samples, and more terms in the expansion, respectively.

Figure 7.17: Histograms for distribution of $k$ and $y(t = 1; k = 1 + \xi/3)$ for $10^6$ Monte Carlo
simulations for various values of $k$.

Figure 7.18: Estimates of $y(t)$ which satisfies $dy/dt = -ky$, $y(0) = 1$, for $k = 1 + \xi/3$, where
$\xi$ is a random variable, normally distributed about zero.

Histograms of the scaled frequency of occurrence of $k$ and $y(t = 1; k = 1 + \xi/3)$ within bins of
specified width from the Monte Carlo method for $10^6$ realizations are plotted in Fig. 7.17. We show
fifty bins within which the scaled number of occurrences of $k$ and $y(t = 1)$ are realized. The scaling
factor applied to the number of occurrences was selected so that the area under the curve is unity.
This is achieved by scaling the number of occurrences within a bin by the product of the total number
of occurrences and the bin width; this allows the scaled number of occurrences to be thought of as a
probability density. As designed, $k$ appears symmetric about its mean value of unity, with a standard
deviation of 1/3. Detailed analysis would reveal that $k$ in fact has a Gaussian distribution. But
$y(t = 1; k = 1 + \xi/3)$ does not have a Gaussian distribution about its mean, as it has been skewed
by the dynamics of the differential equation for $y$. The time-evolution of $y$ is plotted in Fig. 7.18.
The black line gives the naive estimate, $e^{-t}$. The green line gives $y_0(t)$, and the two blue lines give
$y_0(t) \pm y_1(t)$, that is, the mean value of $y$, plus or minus one standard deviation.
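
The comparison between the naive, Monte Carlo, and two-term polynomial chaos estimates at $t = 1$ can be
repeated with a short script. This is a sketch only: the seed, the sample count of $2 \times 10^5$ (smaller than the
$10^6$ used in the text), and the variable names are assumptions, so the Monte Carlo digits will not match
Eqs. (7.682) and (7.684) exactly. The closed-form log-normal mean and standard deviation of $e^{-kt}$ are added
here purely as a reference and do not appear in the original discussion.

```python
import math
import random

MU, SIGMA, T = 1.0, 1.0 / 3.0, 1.0

def monte_carlo(n_samples, seed=0):
    """Sample k = MU + SIGMA*xi, xi ~ N(0, 1), and return mean and std of exp(-k T)."""
    rng = random.Random(seed)
    ys = [math.exp(-(MU + SIGMA * rng.gauss(0.0, 1.0)) * T) for _ in range(n_samples)]
    mean = sum(ys) / n_samples
    var = sum((v - mean) ** 2 for v in ys) / n_samples
    return mean, math.sqrt(var)

# Naive estimate that ignores the uncertainty in k.
naive = math.exp(-MU * T)

# Two-term polynomial chaos estimates: y0(t) and |y1(t)|.
pc_mean = math.exp(-MU * T) * math.cosh(SIGMA * T)
pc_std = math.exp(-MU * T) * math.sinh(SIGMA * T)

# Reference values: exp(-k T) is log-normal when k is Gaussian.
exact_mean = math.exp(-MU * T + 0.5 * (SIGMA * T) ** 2)
exact_std = exact_mean * math.sqrt(math.exp((SIGMA * T) ** 2) - 1.0)

mc_mean, mc_std = monte_carlo(200_000)

print(f"naive      mean = {naive:.6f}")
print(f"poly chaos mean = {pc_mean:.6f}, std = {pc_std:.6f}")
print(f"Monte Carlo mean = {mc_mean:.6f}, std = {mc_std:.6f}")
print(f"reference  mean = {exact_mean:.6f}, std = {exact_std:.6f}")
```

The gap between the two-term standard deviation and the sampled one is a truncation effect of the expansion,
consistent with the remark that more terms would improve the polynomial chaos estimate.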
Problems

1. Use a one-term collocation method with a polynomial basis function to find an approximation for

$$y'''' + (1 + x) y = 1,$$

with $y(0) = y'(0) = y(1) = y''(1) = 0$.

2. Use two-term spectral, collocation, subdomain, least squares and moments methods to solve the
equation

$$y'''' + (1 + x) y = 1,$$

with $y(0) = y'(0) = y(1) = y''(1) = 0$. Compare graphically with the exact solution.

3. If $x_1, x_2, \ldots, x_N$ and $y_1, y_2, \ldots, y_N$ are real numbers, show that

$$\left( \sum_{n=1}^{N} x_n y_n \right)^2 \le \left( \sum_{n=1}^{N} x_n^2 \right) \left( \sum_{n=1}^{N} y_n^2 \right).$$

4. If $x, y \in X$, an inner product space, and $x$ is orthogonal to $y$, then show that $||x + ay|| = ||x - ay||$
where $a$ is a scalar.

5. For an inner product space, show that 

<x,y + z> = <x, y> + <x, z>, 
<ax,y> = a<x,y>, 
<x, y> = <y, x> in a real vector space. 

6. The linear operator $A : X \to Y$, where $X = \mathbb{R}^2$, $Y = \mathbb{R}^2$. The norms in $X$ and $Y$ are defined by

$x = (\xi_1, \xi_2)^T \in X$, $||x||_\infty = \max(|\xi_1|, |\xi_2|)$,

$y = (\eta_1, \eta_2)^T \in Y$, $||y||_1 = |\eta_1| + |\eta_2|$.
Find ||A|| if A= (^ I ~_\ 

7. Let $\mathbb{Q}$, $\mathbb{C}$ and $\mathbb{R}$ be the sets of all rational, complex and real numbers respectively. For the following
determine if $A$ is a vector space over the field $F$. For finite-dimensional vector spaces, find also a set
of basis vectors.

(a) $A$ is the set of all polynomials which are all exactly of degree $n$, $F = \mathbb{R}$.

(b) $A$ is the set of all functions with continuous second derivatives over the interval $[0, L]$ and
satisfying the differential equation $y'' + 2y' + y = 0$, $F = \mathbb{R}$.

(c) $A = \mathbb{R}$, $F = \mathbb{R}$.

(d) $A = \{(a_1, a_2, a_3)$ such that $a_1, a_2 \in \mathbb{Q}$, $2a_1 + a_2 = 4a_3\}$, $F = \mathbb{Q}$.

(e) $A = \mathbb{C}$, $F = \mathbb{Q}$.

(f) $A = \{a e^x + b e^{-2x}$ such that $a, b \in \mathbb{R}$, $x \in [0, 1]\}$, $F = \mathbb{R}$.
8. Which of the following subsets of $\mathbb{R}^3$ constitute a subspace of $\mathbb{R}^3$ where $x = (x_1, x_2, x_3) \in \mathbb{R}^3$:

(a) All $x$ with $x_1 = x_2$ and $x_3 = 0$.

(b) All $x$ with $x_1 = x_2 + 1$.

(c) All $x$ with positive $x_1, x_2, x_3$.

(d) All $x$ with $x_1 - x_2 + x_3 =$ constant $k$.

9. Given a set $S$ of linearly independent vectors in a vector space $V$, show that any subset of $S$ is also
linearly independent.

10. Do the following vectors, $(3, 1, 4, -1)^T$, $(1, -4, 0, 4)^T$, $(-1, 2, 2, 1)^T$, $(-1, 9, 5, -6)^T$, form a basis in $\mathbb{R}^4$?

11. Given $x_1$, the iterative procedure $x_{n+1} = \mathbf{L} x_n$ generates $x_2, x_3, x_4, \ldots$, where $\mathbf{L}$ is a linear operator
and all the $x$'s belong to a complete normed space. Show that $\{x_n, n = 1, 2, \ldots\}$ is a Cauchy sequence
if $||\mathbf{L}|| < 1$. Does it converge? If so find the limit.

12. If $\{e_n, n = 1, 2, \ldots\}$ is an orthonormal set in a Hilbert space $H$, show that for every $x \in H$, the vector
$y = \sum_{n=1}^{\infty} \langle x, e_n \rangle e_n$ exists in $H$, and that $x - y$ is orthogonal to every $e_n$.

13. Let the linear operator A : C 2 — > C 2 be represented by the matrix A = I I . Find ||A|| if all 
vectors in the domain and range are within a Hilbert space. 

14. Let the linear operator A : C 2 — > C 2 be represented by the matrix A = I I . Find ||A|| 



if all vectors in the domain and range are within a Hilbert space. 

15. Using the inner product $(x, y) = \int_a^b w(t) x(t) y(t) \, dt$, where $w(t) > 0$ for $a \le t \le b$, show that the
Sturm-Liouville operator

$$\mathbf{L} = \frac{1}{w(t)} \left( \frac{d}{dt} \left( p(t) \frac{d}{dt} \right) + r(t) \right),$$

with $\alpha x(a) + \beta x'(a) = 0$, and $\gamma x(b) + \delta x'(b) = 0$ is self-adjoint.
16. For elements $x$, $y$ and $z$ of an inner product space, prove the Apollonius[26] identity:

$$||z - x||_2^2 + ||z - y||_2^2 = \frac{1}{2} ||x - y||_2^2 + 2 \left|\left| z - \frac{1}{2}(x + y) \right|\right|_2^2.$$

17. If $x, y \in X$, an inner product space, and $x$ is orthogonal to $y$, then show that $||x + ay||_2 = ||x - ay||_2$
where $a$ is a scalar.

18. Using the Gram-Schmidt procedure, find the first three members of the orthonormal set belonging to
$L_2(-\infty, \infty)$, using the basis functions $\{\exp(-t^2/2), t \exp(-t^2/2), t^2 \exp(-t^2/2), \ldots\}$. You may need
the following definite integral

$$\int_{-\infty}^{\infty} \exp(-t^2/2) \, dt = \sqrt{2\pi}.$$

19. Let $C(0,1)$ be the space of all continuous functions in $(0,1)$ with the norm

$$||f||_2 = \sqrt{\int_0^1 |f(t)|^2 \, dt}.$$

Show that

$$f_n(t) = \begin{cases} 2^n t^{n+1} & \text{for } 0 \le t < \frac{1}{2}, \\ 1 - 2^n (1 - t)^{n+1} & \text{for } \frac{1}{2} \le t \le 1, \end{cases}$$

belongs to $C(0,1)$. Show also that $\{f_n, n = 1, \ldots\}$ is a Cauchy sequence, and that $C(0,1)$ is not
complete.

26 Apollonius of Perga, ca. 262 BC-ca. 190 BC, Greek astronomer and geometer.

20. Find the first three terms of the Fourier-Legendre series for $f(x) = \cos(\pi x/2)$ for $x \in [-1, 1]$. Compare
graphically with exact function.

21. Find the first three terms of the Fourier-Legendre series for

$$f(x) = \begin{cases} -1, & \text{for } -1 \le x < 0, \\ 1, & \text{for } 0 \le x \le 1. \end{cases}$$

22. Consider



±y +2 t 3 y = l-t, y(0) = y(2) = ^(0) = 0. 



Choosing polynomials as the basis functions, use a Galerkin and moments method to obtain a two-term
estimate to $y(t)$. Plot your approximations and the exact solution on a single curve. Plot the
residual in both methods for $t \in [0, 2]$.

23. Solve 

x" + 2xx' +t = 0, 

with $x(0) = 0$, $x(4) = 0$, approximately using a two-term weighted residual method where the basis
functions are of the type $\sin \lambda t$. Do both a spectral (as a consequence Galerkin) and pseudospectral
(as a consequence collocation) method. Plot your approximations and the exact solution on a single
curve. Plot the residual in both methods for $t \in [0, 4]$.

24. Show that the set of solutions of the linear equations

$$x_1 + 3x_2 + x_3 - x_4 = 0,$$

$$-2x_1 + 2x_2 - 2x_3 + x_4 = 0,$$

form a vector space. Find the dimension and a set of basis vectors.

25. Let 




For $\mathbf{A} : \mathbb{R}^3 \to \mathbb{R}^3$, find $\|\mathbf{A}\|$ if the norm of $\mathbf{x} = (x_1, x_2, x_3)^T \in \mathbb{R}^3$ is given by

$$\|\mathbf{x}\|_\infty = \max(|x_1|, |x_2|, |x_3|).$$
26. For any complete orthonormal set $\{\phi_i,\ i = 1, 2, \ldots\}$ in a Hilbert space $\mathbb{H}$, show that

$$u = \sum_i \langle u, \phi_i \rangle \phi_i,$$

$$\langle u, v \rangle = \sum_i \langle u, \phi_i \rangle \langle \phi_i, v \rangle,$$

$$\|u\|_2^2 = \sum_i |\langle u, \phi_i \rangle|^2,$$

where u and v belong to $\mathbb{H}$.

27. Show that the set $\mathbb{P}^4[0, 1]$ of all polynomials of degree 4 or less in the interval 0 < x < 1 is a vector space. What is the dimension of this space?

28. Show that

$$(x_1^2 + x_2^2 + \cdots + x_N^2)(y_1^2 + y_2^2 + \cdots + y_N^2) \ge (x_1 y_1 + x_2 y_2 + \cdots + x_N y_N)^2,$$

where $x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N$ are real numbers.

29. Show that the functions $e_1(t), e_2(t), \ldots, e_N(t)$ are orthogonal in $L_2(0, 1]$, where

$$e_n(t) = \begin{cases} 1 & \frac{n-1}{N} < t \le \frac{n}{N}, \\ 0 & \text{otherwise.} \end{cases}$$

Expand $t^2$ in terms of these functions.

30. Find one-term collocation approximations for all solutions of 

d 2 y 



dx 

with y(0) = 0, 2/(1) = 0. 
31. Show that 



2+2/ =1- 



$$\sqrt{\int_a^b (f(x) + g(x))^2\, dx} \le \sqrt{\int_a^b (f(x))^2\, dx} + \sqrt{\int_a^b (g(x))^2\, dx},$$

where f(x) and g(x) belong to $L_2[a, b]$.

32. Find the eigenvalues and eigenfunctions of the operator 

d?_ d_ 

dx 2 dx 

which operates on functions $y \in L_2[0, 5]$ that vanish at x = 0 and x = 5.

33. Find the supremum and infimum of the set $S = \{1/n$, where $n = 1, 2, \ldots\}$.

34. Find the $L_2[0, 1]$ norm of the function f(x) = x + 1.

35. Find the distance between the functions x and $x^3$ under the $L_2[0, 1]$ norm.

36. Find the inner product of the functions x and $x^3$ using the $L_2[0, 1]$ definition.

37. Find the Green's function for the problem

$$\frac{d^2 x}{dt^2} + kx = f(t), \qquad \text{with } x(0) = a,\ x(\pi) = b.$$

Write the solution of the differential equation in terms of this function.

38. Find the first three terms of the Fourier-Legendre series for

$$f(x) = \begin{cases} -2 & \text{for } -1 \le x < 0, \\ 1 & \text{for } 0 \le x \le 1. \end{cases}$$

Graph f(x) and its approximation.

39. Find the null space of

(a) the matrix operator



(b) the differential operator 

T d2 ,2 

40. Test the positive definiteness of a diagonal matrix with positive real numbers on the diagonal. 

41. Let $\mathbb{S}$ be a subspace of $L_2[0, 1]$ such that for every $x \in \mathbb{S}$, x(0) = 0, and x(0) = 1. Find the eigenvalues and eigenfunctions of $L = -d^2/dt^2$ operating on elements of $\mathbb{S}$.



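42. Show that

$$\lim_{\epsilon \to 0} \int_\alpha^\beta f(x) \Delta_\epsilon(x - a)\, dx = f(a),$$

for $a \in (\alpha, \beta)$, where

$$\Delta_\epsilon(x - a) = \begin{cases} 0, & \text{if } x < a - \frac{\epsilon}{2}, \\ \frac{1}{\epsilon}, & \text{if } a - \frac{\epsilon}{2} \le x \le a + \frac{\epsilon}{2}, \\ 0, & \text{if } x > a + \frac{\epsilon}{2}. \end{cases}$$

43. Consider functions of two variables in a domain $\Omega$ with the inner product defined as

$$\langle u, v \rangle = \iint_\Omega u(x, y) v(x, y)\, dx\, dy.$$

Find the space of functions such that the Laplacian operator is self-adjoint.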

44. Find the eigenvalues and eigenfunctions of the operator L where 

t t^ + i\ d2 y Jy 

with $t \in [-1, 1]$ and y(-1) = y(1) = 0. Show that there exists a weight function r(x) such that the eigenfunctions are orthogonal in [-1, 1] with respect to it.

45. Show that the eigenvalues of an operator and its adjoint are complex conjugates of each other. 

46. Using an eigenvector expansion, find the general solution of A • x = y where 




47. Show graphically that the Fourier trigonometric series representation of the function

$$f(x) = \begin{cases} -1, & \text{if } -\pi \le x < 0, \\ 1, & \text{if } 0 \le x \le \pi, \end{cases}$$

always has an overshoot near x = 0, however many terms one takes (Gibbs phenomenon). Estimate the overshoot.

48. Let $\{e_1, \ldots, e_N\}$ be an orthonormal set in an inner product space $\mathbb{S}$. Approximate $x \in \mathbb{S}$ by $y = \beta_1 e_1 + \cdots + \beta_N e_N$, where the $\beta$'s are to be selected. Show that $\|x - y\|$ is a minimum if we choose $\beta_i = \langle x, e_i \rangle$.

49. (a) Starting with a vector in the direction $(1, 2, 0)^T$ use the Gram-Schmidt procedure to find a set of orthonormal vectors in $\mathbb{R}^3$. Using these vectors, construct (b) an orthogonal matrix Q, and then find (c) the angles between x and Q · x, where x is $(1, 0, 0)^T$, $(0, 1, 0)^T$ and $(0, 0, 1)^T$, respectively. The orthogonal matrix Q is defined as a matrix having orthonormal vectors in its columns.

50. Find the null space of the operator L defined by $Lx = (d^2/dt^2)x(t)$. Also find the eigenvalues and eigenfunctions (in terms of real functions) of L with x(0) = 1, (dx/dt)(0) = 0.

51. Find all approximate solutions of the boundary value problem 



fy 

dx 2 



-r~2+y + 5 y = -x, 



with y(0) = y(1) = 0 using a two-term collocation method. Compare graphically with the exact solution determined by numerical methods.

52. Find a one-term approximation for the boundary value problem

$$y'' - y = -x^3,$$

with y(0) = y(1) = 0, using the collocation, Galerkin, least-squares, and moments methods. Compare graphically with the exact solution.

53. Consider the sequence { " } in M. N . Show that this is a Cauchy sequence. Does it converge? 

54. Prove that $(L_a L_b)^* = L_b^* L_a^*$ when $L_a$ and $L_b$ are linear operators which operate on vectors in a Hilbert space.

55. If $\{x_i\}$ is a sequence in an inner product space such that the series $\|x_1\| + \|x_2\| + \cdots$ converges, show that $\{s_N\}$ is a Cauchy sequence, where $s_N = x_1 + x_2 + \cdots + x_N$.

56. If $L(x) = a_0(t)\frac{d^2 x}{dt^2} + a_1(t)\frac{dx}{dt} + a_2(t)x$, find the operator that is formally adjoint to it.

57. If

$$y(t) = L(x(t)) = \int_0^t x(\tau)\, d\tau,$$

where y(t) and x(t) are real functions in some properly defined space, find the eigenvalues and eigenfunctions of the operator L.

58. Using a dual basis, expand the vector $(1, 3, 2)^T$ in terms of the basis vectors $(1, 1, 1)^T$, $(1, 0, -1)^T$, and $(1, 0, 1)^T$ in $\mathbb{R}^3$. The inner product is defined as usual.

59. With $f_1(x) = 1 + i + x$ and $f_2(x) = 1 + ix + ix^2$,

a) Find the $L_2[0, 1]$ norms of $f_1(x)$ and $f_2(x)$.

b) Find the inner product of $f_1(x)$ and $f_2(x)$ under the $L_2[0, 1]$ norm.

c) Find the "distance" between $f_1(x)$ and $f_2(x)$ under the $L_2[0, 1]$ norm.

60. Show the vectors $u_1 = (-i, 0, 2, 1 + i)^T$, $u_2 = (1, 2, i, 3)^T$, $u_3 = (3 + i, 3 - i, 0, -2)^T$, $u_4 = (1, 0, 1, 3)^T$ form a basis in $\mathbb{C}^4$. Find the set of reciprocal basis vectors. For $x \in \mathbb{C}^4$, and $x = (i, 3 - i, -2, 2)^T$, express x as an expansion in the above-defined basis vectors. That is, find $\alpha_i$ such that $x = \alpha_i u_i$.

61. The following norms can be used in $\mathbb{R}^N$, where $x = (\xi_1, \ldots, \xi_N) \in \mathbb{R}^N$.

(a) $\|x\|_\infty = \max_{1 \le n \le N} |\xi_n|$,

(b) $\|x\|_1 = \sum_{n=1}^N |\xi_n|$,

(c) $\|x\|_2 = \left(\sum_{n=1}^N |\xi_n|^2\right)^{1/2}$,

(d) $\|x\|_p = \left(\sum_{n=1}^N |\xi_n|^p\right)^{1/p}$, $1 \le p < \infty$.

Show by examples that these are all valid norms.

62. Show that the set of all matrices $\mathbf{A} : \mathbb{R}^N \to \mathbb{R}^M$ is a vector space under the usual rules of matrix manipulation.

63. Show that if A is a linear operator such that 
(a) A : (KM| ■ |U) -> (RM| • HO, then ||A|| = E*=i M. 



(b) A : (W N ,\\ ■ IU) -» (RM| ■ |U), then ||A|| = maxK.^EL^. 



64. If

$$Lu = a(x)\frac{d^2 u}{dx^2} + b(x)\frac{du}{dx} + c(x)u,$$

show

$$L^* u = \frac{d^2}{dx^2}(au) - \frac{d}{dx}(bu) + cu.$$



65. Consider the function $x(t) = \sin(4t)$ for $t \in [0, 1]$. Project x(t) onto the space spanned by the functions $u_m(t)$ so as to find the coefficients $\alpha_m$, where $x(t) \simeq x_p(t) = \sum_{m=1}^M \alpha_m u_m(t)$ when the basis functions are

(a) M = 2; $u_1(t) = t$, $u_2(t) = t^2$.

(b) M = 3; $u_1(t) = 1$, $u_2(t) = t^2$, $u_3(t) = \tan t$.

In each case plot x(t) and its approximation on the same plot.

66. Project the vector $x = (1, 2, 3, 4)^T$ onto the space spanned by the vectors $u_1$, $u_2$, so as to find the projection $x \simeq x_p = \alpha_1 u_1 + \alpha_2 u_2$.

/1\ 

(a) mi = 



(b)«i 



67. Show the vectors $u_1 = (-i, 3, 1 - i, 1 + i)^T$, $u_2 = (i + 1, 2, i, 3)^T$, $u_3 = (3 + i, 3 - i, 0, -2)^T$, $u_4 = (1, 0, 2, 3)^T$ form a basis in $\mathbb{C}^4$. Find the set of reciprocal basis vectors. For $x \in \mathbb{C}^4$, and $x = (i, 3 - i, -5, 2 + i)^T$,

(a) express x as an expansion in the above-defined basis vectors. That is, find $\alpha_i$ such that $x = \alpha_i u_i$.

(b) project onto the space spanned by $u_1$, $u_2$, and $u_3$. That is, find the best set of $\alpha_i$ such that $x \simeq x_p = \sum_{i=1}^3 \alpha_i u_i$.

68. Consider $d^2 y/dt^2 = -ky$, y(0) = 1, dy/dt(0) = 0. With $\xi \in (-\infty, \infty)$ a random normally distributed variable with mean of zero and standard deviation of unity, consider $k = \mu + \sigma\xi$. Use the method of polynomial chaos to get a two-term estimate for y when $\mu = 1$, $\sigma = 1/10$. Compare the expected value of y(t = 10) with that of a Monte Carlo simulation and that when $\sigma = 0$.



CC BY-NC-ND. 29 July 2012, Sen & Powers.



Chapter 8 
Linear algebra 



see Kaplan, Chapter 1, 

see Lopez, Chapters 33, 34, 

see Riley, Hobson, and Bence, Chapter 7, 

see Michel and Herget, 

see Golub and Van Loan,

see Strang, Linear Algebra and its Applications,



see Strang, Introduction to Applied Mathematics. 

The key problem in linear algebra is addressing the equation 

Ax = b, (8.1) 

where A is a known constant rectangular matrix, b is a known column vector, and x is 
an unknown column vector. In this chapter, we will more often consider A to be an alibi 
transformation in which the coordinate axes remain fixed, though occasionally we revert to 
alias transformations. To explicitly indicate the dimension of the matrices and vectors, we 
sometimes write this in expanded form: 

$$\mathbf{A}_{N \times M} \cdot \mathbf{x}_{M \times 1} = \mathbf{b}_{N \times 1}, \tag{8.2}$$

where $N, M \in \mathbb{N}$ are the positive integers which give the dimensions. If N = M, the matrix is square, and solution techniques are usually straightforward. For $N \ne M$, which arises often in physical problems, the issues are not as straightforward. In some cases we find an infinite number of solutions; in others we find none. Relaxing our equality constraint, we can, however, always find a vector $\mathbf{x}_p$

$$\mathbf{x}_p = \mathbf{x} \text{ such that } \|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2 \to \min. \tag{8.3}$$

This vector $\mathbf{x}_p$ is the best solution to the equation A · x = b, for cases in which there is no exact solution. Depending on the problem, it may turn out that $\mathbf{x}_p$ is not unique. It will always be the case, however, that of all the vectors $\mathbf{x}_p$ which minimize $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$, one of them, $\hat{\mathbf{x}}$, will itself have a minimum norm. We will define here the residual r as

$$\mathbf{r} = \mathbf{A} \cdot \mathbf{x} - \mathbf{b}. \tag{8.4}$$

In general, we will seek an x that minimizes $\|\mathbf{r}\|_2$.



8.1 Determinants and rank 

We can take the determinant of a square matrix A, written det A. Details of computation of 
determinants are found in any standard reference and will not be repeated here. Properties 
of the determinant include 

• $\det \mathbf{A}_{N \times N}$ is equal to the (signed) volume of a parallelepiped in N-dimensional space whose edges are formed by the rows of A.

• If all elements of a row (or column) are multiplied by a scalar, the determinant is also 
similarly multiplied. 

• The elementary operation of subtracting a multiple of one row from another leaves the 
determinant unchanged. 

• If two rows (or columns) of a matrix are interchanged the sign of the determinant 
changes. 

A singular matrix is one whose determinant is zero. The rank of a matrix is the size r of the 
largest square non-singular matrix that can be formed by deleting rows and columns. 
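
The listed properties can be checked numerically. The following Python sketch is not part of the original notes; the helper `det` and the 3 × 3 matrix are illustrative assumptions. It evaluates the determinant by Gaussian elimination in exact rational arithmetic and then applies a row interchange, a row scaling, and a row subtraction.

```python
from fractions import Fraction

def det(rows):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # find a row at or below the diagonal with a nonzero pivot
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                # a row interchange flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # subtracting a multiple of one row leaves the determinant unchanged
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return sign * result

# Illustrative matrix (an assumption, not taken from the notes).
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
swapped = [A[1], A[0], A[2]]                                  # rows 1 and 2 interchanged
scaled = [[5 * v for v in A[0]], A[1], A[2]]                  # first row multiplied by 5
sheared = [A[0], [x - 7 * y for x, y in zip(A[1], A[0])], A[2]]  # row 2 minus 7 * row 1

print(det(A), det(swapped), det(scaled), det(sheared))
```

For matrices of this size any method is adequate; elimination is used here only because it makes the row-operation properties visible in the code itself.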

While the determinant is useful to some ends in linear algebra, most of the common 
problems are better solved without using the determinant at all; in fact it is probably a fair 
generalization to say that the determinant is less, rather than more, useful than imagined by 
many. It is useful in solving linear systems of equations of small dimension, but becomes much 
too cumbersome relative to other methods for commonly encountered large systems of linear 
algebraic equations. While it can be used to find the rank, there are also other more efficient 
means to calculate this. Further, while a zero value for the determinant almost always has 
significance, other values do not. Some matrices which are particularly ill-conditioned for 
certain problems often have a determinant which gives no clue as to difficulties which may 
arise.

8.2 Matrix algebra



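We will denote a matrix of size N × M as

$$\mathbf{A}_{N \times M} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} & \cdots & a_{1M} \\ a_{21} & a_{22} & \cdots & a_{2m} & \cdots & a_{2M} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} & \cdots & a_{nM} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{Nm} & \cdots & a_{NM} \end{pmatrix}. \tag{8.5}$$

Addition of matrices can be defined as

$$\mathbf{A}_{N \times M} + \mathbf{B}_{N \times M} = \mathbf{C}_{N \times M}, \tag{8.6}$$

where the elements of C are obtained by adding the corresponding elements of A and B. Multiplication of a matrix by a scalar $\alpha$ can be defined as

$$\alpha \mathbf{A}_{N \times M} = \mathbf{B}_{N \times M}, \tag{8.7}$$

where the elements of B are the corresponding elements of A multiplied by $\alpha$.

It can be shown that the set of all N × M matrices is a vector space. We will also refer to an N × 1 matrix as an N-dimensional column vector. Likewise a 1 × M matrix will be called an M-dimensional row vector. Unless otherwise stated vectors are assumed to be column vectors. In this sense the inner product of two vectors $\mathbf{x}_{N \times 1}$ and $\mathbf{y}_{N \times 1}$ is $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^T \cdot \mathbf{y}$. In this chapter matrices will be represented by upper-case bold-faced letters, such as A, and vectors by lower-case bold-faced letters, such as x.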



8.2.1 Column, row, left and right null spaces 

The M column vectors $\mathbf{c}_m \in \mathbb{C}^N$, m = 1, 2, ..., M, of the matrix $\mathbf{A}_{N \times M}$ are each one of the columns of A. The column space is the subspace of $\mathbb{C}^N$ spanned by the column vectors. The N row vectors $\mathbf{r}_n \in \mathbb{C}^M$, n = 1, 2, ..., N, of the same matrix are each one of the rows. The row space is the subspace of $\mathbb{C}^M$ spanned by the row vectors. The column space and the row space have the same dimension. The right null space is the set of all vectors $\mathbf{x}_{M \times 1} \in \mathbb{C}^M$ for which $\mathbf{A}_{N \times M} \cdot \mathbf{x}_{M \times 1} = \mathbf{0}_{N \times 1}$. The left null space is the set of all vectors $\mathbf{y}_{N \times 1} \in \mathbb{C}^N$ for which $\mathbf{y}_{N \times 1}^T \cdot \mathbf{A}_{N \times M} = \mathbf{y}_{1 \times N} \cdot \mathbf{A}_{N \times M} = \mathbf{0}_{1 \times M}$.

If we have $\mathbf{A}_{N \times M} : \mathbb{C}^M \to \mathbb{C}^N$, and recall that the rank of A is r, then we have the following important results:

• The column space of $\mathbf{A}_{N \times M}$ has dimension r, (r ≤ M).

• The left null space of $\mathbf{A}_{N \times M}$ has dimension N − r.

• The row space of $\mathbf{A}_{N \times M}$ has dimension r, (r ≤ N).

• The right null space of $\mathbf{A}_{N \times M}$ has dimension M − r.

We also can show

$$\mathbb{C}^N = \text{column space} \oplus \text{left null space}, \tag{8.8}$$

$$\mathbb{C}^M = \text{row space} \oplus \text{right null space}. \tag{8.9}$$

Also

• Any vector $\mathbf{x} \in \mathbb{C}^M$ can be written as a linear combination of vectors in the row space and the right null space.

• Any M-dimensional vector x which is in the right null space of A is orthogonal to any M-dimensional vector in the row space. This comes directly from the definition of the right null space A · x = 0.

• Any vector $\mathbf{y} \in \mathbb{C}^N$ can be written as the sum of vectors in the column space and the left null space.

• Any N-dimensional vector y which is in the left null space of A is orthogonal to any N-dimensional vector in the column space. This comes directly from the definition of the left null space $\mathbf{y}^T \cdot \mathbf{A} = \mathbf{0}^T$.



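Example 8.1

Find the column and row spaces of

$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \tag{8.10}$$

and their dimensions.

Restricting ourselves to real vectors, we note first that in the equation A · x = b, A is an operator which maps three-dimensional real vectors x into vectors b which are elements of a two-dimensional real space, i.e.

$$\mathbf{A} : \mathbb{R}^3 \to \mathbb{R}^2. \tag{8.11}$$

The column vectors are

$$\mathbf{c}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{8.12}$$

$$\mathbf{c}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{8.13}$$

$$\mathbf{c}_3 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \tag{8.14}$$

The column space consists of the vectors $\alpha_1 \mathbf{c}_1 + \alpha_2 \mathbf{c}_2 + \alpha_3 \mathbf{c}_3$, where the $\alpha$'s are any scalars. Since only two of the $\mathbf{c}_i$'s are linearly independent, the dimension of the column space is also two. We can see this by looking at the sub-determinant

$$\det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1, \tag{8.15}$$

which indicates the rank, r = 2. Note that

• $\mathbf{c}_1 + 2\mathbf{c}_2 = \mathbf{c}_3$.

• The three column vectors thus lie in a single two-dimensional plane.

• The three column vectors are thus said to span the two-dimensional space $\mathbb{R}^2$.

The two row vectors are

$$\mathbf{r}_1 = \begin{pmatrix} 1 & 0 & 1 \end{pmatrix}, \tag{8.16}$$

$$\mathbf{r}_2 = \begin{pmatrix} 0 & 1 & 2 \end{pmatrix}. \tag{8.17}$$

The row space consists of the vectors $\beta_1 \mathbf{r}_1 + \beta_2 \mathbf{r}_2$, where the $\beta$'s are any scalars. Since the two $\mathbf{r}_i$'s are linearly independent, the dimension of the row space is also two. That is, the two row vectors are both three dimensional, but span a two-dimensional subspace.

We note for instance, if $\mathbf{x} = (1, 2, 1)^T$, that A · x = b gives

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}. \tag{8.18}$$

So

$$\mathbf{b} = 1\mathbf{c}_1 + 2\mathbf{c}_2 + 1\mathbf{c}_3. \tag{8.19}$$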

That is b is a linear combination of the column space vectors and thus lies in the column space of A. 
We note for this problem that since an arbitrary b is two-dimensional and the dimension of the column 
space is two, that we can represent an arbitrary b as some linear combination of the column space 
vectors. For example, we can also say that b = 2ci + 4c 2 . We also note that x in general does not 
lie in the row space of A, since x is an arbitrary three-dimensional vector, and we only have enough 
row vectors to span a two-dimensional subspace (i.e. a plane embedded in a three-dimensional space). 
However, as will be seen, x does lie in the space defined by the combination of the row space of A, and 
the right null space of A (the set of vectors x for which A • x = 0). In special cases, x will in fact lie 
in the row space of A. 
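
The dimension counts of this example can be reproduced by row reduction. The sketch below is an illustration only: it assumes the matrix with rows (1, 0, 1) and (0, 1, 2), which is consistent with the relation c1 + 2c2 = c3 noted in the example, and the helpers `rref`, `null_space` and `transpose` are hypothetical names introduced here.

```python
from fractions import Fraction

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def rref(rows):
    """Return (reduced row echelon form, list of pivot columns)."""
    a = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if a[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column: free variable
        a[r], a[p] = a[p], a[r]
        a[r] = [v / a[r][c] for v in a[r]]
        for i in range(n_rows):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return a, pivots

def null_space(rows):
    """Basis of the right null space, one vector per free column."""
    a, pivots = rref(rows)
    n_cols = len(rows[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row, p in zip(a, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

A = [[1, 0, 1], [0, 1, 2]]                 # assumed matrix of the example, N = 2, M = 3
rank = len(rref(A)[1])
right_null = null_space(A)                 # dimension M - r
left_null = null_space(transpose(A))       # dimension N - r

print(rank, right_null, left_null)
```

With r = 2, the right null space has dimension 3 − 2 = 1 and the left null space has dimension 2 − 2 = 0, in agreement with the results listed before the example.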

I 




8.2.2 Matrix multiplication 

Multiplication of matrices A and B can be defined if they are of the proper sizes. Thus

$$\mathbf{A}_{N \times L} \cdot \mathbf{B}_{L \times M} = \mathbf{C}_{N \times M}. \tag{8.20}$$

It may be better to say here that A is a linear operator which operates on elements which are in a space of dimension L × M so as to generate elements which are in a space of dimension N × M; that is, $\mathbf{A} : \mathbb{R}^L \times \mathbb{R}^M \to \mathbb{R}^N \times \mathbb{R}^M$.

Example 8.2

Consider the matrix operator

$$\mathbf{A} = \begin{pmatrix} 1 & 2 & 1 \\ -3 & 3 & 1 \end{pmatrix}, \tag{8.21}$$

which operates on 3 × 4 matrices, i.e.

$$\mathbf{A} : \mathbb{R}^3 \times \mathbb{R}^4 \to \mathbb{R}^2 \times \mathbb{R}^4, \tag{8.22}$$

and show how it acts on another matrix.

We can use A to operate on a 3 × 4 matrix as follows:



-> • u ~: I ?)-(* -- "'■ 



Note the operation does not exist if the order is reversed. 



J 



A vector operating on a vector can yield a scalar or a matrix, depending on the order of 
operation. 



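Example 8.3

Consider the vector operations $\mathbf{A}_{1 \times 3} \cdot \mathbf{B}_{3 \times 1}$ and $\mathbf{B}_{3 \times 1} \cdot \mathbf{A}_{1 \times 3}$ where

$$\mathbf{A}_{1 \times 3} = \mathbf{a}^T = \begin{pmatrix} 2 & 3 & 1 \end{pmatrix}, \tag{8.24}$$

$$\mathbf{B}_{3 \times 1} = \mathbf{b} = \begin{pmatrix} 3 \\ -2 \\ 5 \end{pmatrix}. \tag{8.25}$$

Then

$$\mathbf{A}_{1 \times 3} \cdot \mathbf{B}_{3 \times 1} = \mathbf{a}^T \cdot \mathbf{b} = \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -2 \\ 5 \end{pmatrix} = (2)(3) + (3)(-2) + (1)(5) = 5. \tag{8.26}$$

This is the ordinary inner product $\langle \mathbf{a}, \mathbf{b} \rangle$. The commutation of this operation however yields a matrix:

$$\mathbf{B}_{3 \times 1} \cdot \mathbf{A}_{1 \times 3} = \mathbf{b}\mathbf{a}^T = \begin{pmatrix} 3 \\ -2 \\ 5 \end{pmatrix} \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} (3)(2) & (3)(3) & (3)(1) \\ (-2)(2) & (-2)(3) & (-2)(1) \\ (5)(2) & (5)(3) & (5)(1) \end{pmatrix}, \tag{8.27}$$

$$= \begin{pmatrix} 6 & 9 & 3 \\ -4 & -6 & -2 \\ 10 & 15 & 5 \end{pmatrix}. \tag{8.28}$$

This is the dyadic product of the two vectors. Note that for vector (lower case) notation, the dyadic product usually is not characterized by the "dot" operator that we use for the vector inner product.

A special case is that of a square matrix $\mathbf{A}_{N \times N}$ of size N. For square matrices of the same size both A · B and B · A exist. While A · B and B · A both yield N × N matrices, the actual values of the two products are in general different. In what follows, we will often assume that we are dealing with square matrices.

Properties of matrices include

1. (A · B) · C = A · (B · C) (associative),

2. A · (B + C) = A · B + A · C (distributive),

3. (A + B) · C = A · C + B · C (distributive),

4. A · B ≠ B · A in general (not commutative),

5. det(A · B) = (det A)(det B).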
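
The inner and dyadic products of Example 8.3 and two of the properties above can be checked with a few lines of Python. This is a sketch only; `matmul` and `det2` are hypothetical helpers, and the 2 × 2 matrices A and B are illustrative assumptions rather than matrices from the notes.

```python
def matmul(A, B):
    """Product of an N x L matrix and an L x M matrix (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[n][l] * B[l][m] for l in range(len(B)))
             for m in range(len(B[0]))] for n in range(len(A))]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

a_T = [[2, 3, 1]]            # 1 x 3 row vector a^T of Example 8.3
b = [[3], [-2], [5]]         # 3 x 1 column vector b of Example 8.3

inner = matmul(a_T, b)       # 1 x 1 result: the inner product
outer = matmul(b, a_T)       # 3 x 3 result: the dyadic product

# Illustrative square matrices (assumed) to show non-commutativity.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB, BA = matmul(A, B), matmul(B, A)

print(inner, outer)
print(AB, BA, det2(AB) == det2(A) * det2(B))
```

Here B exchanges columns when it multiplies from the right and rows when it multiplies from the left, which is why the two products differ even though their determinants agree.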

8.2.3 Definitions and properties 

8.2.3.1 Identity 

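The identity matrix I is a square diagonal matrix with 1 on the main diagonal. With this definition, we get

$$\mathbf{A}_{N \times M} \cdot \mathbf{I}_{M \times M} = \mathbf{A}_{N \times M}, \tag{8.29}$$

$$\mathbf{I}_{N \times N} \cdot \mathbf{A}_{N \times M} = \mathbf{A}_{N \times M}, \tag{8.30}$$

or, more compactly,

$$\mathbf{A} \cdot \mathbf{I} = \mathbf{I} \cdot \mathbf{A} = \mathbf{A}, \tag{8.31}$$

where the unsubscripted identity matrix is understood to be square with the correct dimension for matrix multiplication.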

8.2.3.2 Nilpotent 

A square matrix A is called nilpotent if there exists a positive integer n for which $\mathbf{A}^n = \mathbf{0}$.

8.2.3.3 Idempotent

A square matrix A is called idempotent if A · A = A. The identity matrix I is idempotent. Projection matrices P, see Eq. (7.160), are idempotent. All idempotent matrices which are not the identity matrix are singular. The trace of an idempotent matrix gives its rank. More generally, a function f is idempotent if f(f(x)) = f(x). As an example, the absolute value function is idempotent since abs(abs(x)) = abs(x).

8.2.3.4 Diagonal

A diagonal matrix D has nonzero terms only along its main diagonal. The sum and product 
of diagonal matrices are also diagonal. The determinant of a diagonal matrix is the product 
of all diagonal elements. 

8.2.3.5 Transpose 

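Here we expand on the earlier discussion of Sec. 6.2.3. The transpose $\mathbf{A}^T$ of a matrix A is an operation in which the terms above and below the diagonal are interchanged. For any matrix $\mathbf{A}_{N \times M}$, we find that $\mathbf{A} \cdot \mathbf{A}^T$ and $\mathbf{A}^T \cdot \mathbf{A}$ are square matrices of size N and M, respectively.

Properties of the transpose include

1. $\det \mathbf{A} = \det \mathbf{A}^T$,

2. $(\mathbf{A}_{N \times M} \cdot \mathbf{B}_{M \times N})^T = \mathbf{B}^T \cdot \mathbf{A}^T$,

3. $(\mathbf{A}_{N \times N} \cdot \mathbf{x}_{N \times 1})^T \cdot \mathbf{y}_{N \times 1} = \mathbf{x}^T \cdot \mathbf{A}^T \cdot \mathbf{y} = \mathbf{x}^T \cdot (\mathbf{A}^T \cdot \mathbf{y})$.

8.2.3.6 Symmetry, anti-symmetry, and asymmetry

To reiterate the earlier discussion of Sec. 6.2.3, a symmetric matrix is one for which $\mathbf{A}^T = \mathbf{A}$. An anti-symmetric or skew-symmetric matrix is one for which $\mathbf{A}^T = -\mathbf{A}$. Any square matrix A can be written as

$$\mathbf{A} = \frac{1}{2}(\mathbf{A} + \mathbf{A}^T) + \frac{1}{2}(\mathbf{A} - \mathbf{A}^T), \tag{8.32}$$

where $(1/2)(\mathbf{A} + \mathbf{A}^T)$ is symmetric and $(1/2)(\mathbf{A} - \mathbf{A}^T)$ is anti-symmetric. An asymmetric matrix is neither symmetric nor anti-symmetric.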
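
The decomposition of Eq. (8.32) is short enough to verify directly. The following sketch assumes an arbitrary 3 × 3 integer matrix chosen for illustration; `transpose` and `split` are hypothetical helper names.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def split(A):
    """Return the symmetric and anti-symmetric parts of a square integer matrix A."""
    AT = transpose(A)
    n = len(A)
    sym = [[Fraction(A[i][j] + AT[i][j], 2) for j in range(n)] for i in range(n)]
    anti = [[Fraction(A[i][j] - AT[i][j], 2) for j in range(n)] for i in range(n)]
    return sym, anti

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # illustrative matrix (assumed)
sym, anti = split(A)

# Recombine the two parts to recover A.
recombined = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(sym, anti)]
print(sym, anti, recombined == A)
```

The anti-symmetric part necessarily has zeros on its diagonal, since each diagonal entry must equal its own negative.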

8.2.3.7 Triangular 

A lower (or upper) triangular matrix is one in which all entries above (or below) the main 
diagonal are zero. Lower triangular matrices are often denoted by L, and upper triangular 
matrices by either U or R. 

8.2.3.8 Positive definite 

A positive definite matrix A is a matrix for which $\mathbf{x}^T \cdot \mathbf{A} \cdot \mathbf{x} > 0$ for all nonzero vectors x. A symmetric positive definite matrix has real, positive eigenvalues. Every symmetric positive definite matrix A can be written as $\mathbf{A} = \mathbf{U}^T \cdot \mathbf{U}$, where U is an upper triangular matrix (Cholesky$^1$ decomposition).
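
A minimal sketch of the Cholesky factorization follows, assuming a real symmetric positive definite input; the function name `cholesky_upper` and the test matrix are illustrative choices, not taken from the notes. The factor is built row by row, and the routine reports failure if a non-positive pivot shows that the matrix is not positive definite.

```python
import math

def cholesky_upper(A):
    """Return upper triangular U with A = U^T U, for symmetric positive definite A."""
    n = len(A)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # diagonal entry: what is left of a_ii after removing earlier rows' contributions
        s = A[i][i] - sum(U[k][i] ** 2 for k in range(i))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        U[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            U[i][j] = (A[i][j] - sum(U[k][i] * U[k][j] for k in range(i))) / U[i][i]
    return U

A = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]   # illustrative matrix (assumed)
U = cholesky_upper(A)

# Rebuild U^T U to compare against A.
UtU = [[sum(U[k][i] * U[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(U, UtU == A)
```

Attempting the factorization is also a practical test of positive definiteness, since it succeeds exactly when every pivot `s` is positive.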



1 after André-Louis Cholesky, 1875-1918, French mathematician and military officer.

8.2.3.9 Permutation

A permutation matrix P is a square matrix composed of zeroes and a single one in each column. None of the ones occur in the same row. It effects a row exchange when it operates on a general matrix A. It is never singular; its inverse is its transpose, $\mathbf{P}^{-1} = \mathbf{P}^T$, and a matrix which exchanges a single pair of rows is in fact its own inverse, $\mathbf{P} = \mathbf{P}^{-1}$, so P · P = I. Also $\|\mathbf{P}\|_2 = 1$, and $|\det \mathbf{P}| = 1$. However, we can have $\det \mathbf{P} = \pm 1$, so it can be either a rotation or a reflection.

The permutation matrix P is not to be confused with a projection matrix P, which is usually denoted in the same way. The context should be clear as to which matrix is intended.



I 

Example 8.4 

Find the permutation matrix P which effects the exchange of the first and second rows of A, where 



.33) 




To construct P, we begin with a 3 × 3 identity matrix I. For a first and second row exchange, we replace the ones in the (1, 1) and (2, 2) slots with zeroes, then replace the zeroes in the (1, 2) and (2, 1) slots with ones. Thus

$$\mathbf{P} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.34}$$
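
The construction just described can be sketched in code. The helper names `exchange_matrix` and `matmul` are hypothetical, and the matrix A used below is an arbitrary stand-in, since any 3 × 3 matrix shows the row exchange equally well.

```python
def exchange_matrix(n, i, j):
    """n x n permutation matrix that exchanges rows i and j (0-based indices)."""
    P = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    P[i], P[j] = P[j], P[i]          # swap two rows of the identity
    return P

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

P = exchange_matrix(3, 0, 1)         # exchange the first and second rows
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # stand-in matrix (assumed)

PA = matmul(P, A)                    # rows 1 and 2 of A are exchanged
PP = matmul(P, P)                    # a single exchange undoes itself

print(P, PA, PP)
```

Multiplying from the right, A · P, would instead exchange the first and second columns of A.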



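Example 8.5

Find the rank and right null space of

$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 1 \\ 5 & 4 & 9 \\ 2 & 4 & 6 \end{pmatrix}. \tag{8.35}$$

The rank of A is not three since

$$\det \mathbf{A} = 0. \tag{8.36}$$

Since

$$\begin{vmatrix} 1 & 0 \\ 5 & 4 \end{vmatrix} \ne 0, \tag{8.37}$$

the rank of A is 2.

Let

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \tag{8.38}$$

belong to the right null space of A. Then

$$x_1 + x_3 = 0, \tag{8.39}$$

$$5x_1 + 4x_2 + 9x_3 = 0, \tag{8.40}$$

$$2x_1 + 4x_2 + 6x_3 = 0. \tag{8.41}$$

One strategy to solve singular systems is to take one of the variables to be a known parameter, and see if the resulting system can be solved. If the resulting system remains singular, take a second variable to be a second parameter. This ad hoc method will later be made systematic.

So here take $x_1 = t$, and consider the first two equations, which gives

$$\begin{pmatrix} 0 & 1 \\ 4 & 9 \end{pmatrix} \begin{pmatrix} x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -t \\ -5t \end{pmatrix}. \tag{8.42}$$

Solving, we find $x_2 = t$, $x_3 = -t$. So,

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} t \\ t \\ -t \end{pmatrix} = t \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}, \qquad t \in \mathbb{R}^1. \tag{8.43}$$

Therefore, the right null space is the straight line in $\mathbb{R}^3$ which passes through (0, 0, 0) and (1, 1, -1).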





8.2.3.10 Inverse 

Definition: A matrix A has an inverse $\mathbf{A}^{-1}$ if $\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{A}^{-1} \cdot \mathbf{A} = \mathbf{I}$.

Theorem

A unique inverse exists if the matrix is non-singular.

Properties of the inverse include

1. $(\mathbf{A} \cdot \mathbf{B})^{-1} = \mathbf{B}^{-1} \cdot \mathbf{A}^{-1}$,

2. $(\mathbf{A}^{-1})^T = (\mathbf{A}^T)^{-1}$,

3. $\det(\mathbf{A}^{-1}) = (\det \mathbf{A})^{-1}$.

If $a_{ij}$ and $a_{ij}^{-1}$ are the elements of A and $\mathbf{A}^{-1}$, and we define the cofactor as

$$c_{ij} = (-1)^{i+j} m_{ij}, \tag{8.44}$$

where the minor, $m_{ij}$, is the determinant of the matrix obtained by canceling out the j-th row and i-th column, then the inverse is

$$a_{ij}^{-1} = \frac{c_{ij}}{\det \mathbf{A}}. \tag{8.45}$$

The inverse of a diagonal matrix is also diagonal, but with the reciprocals of the original diagonal elements.

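Example 8.6

Find the inverse of

$$\mathbf{A} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. \tag{8.46}$$

The inverse is

$$\mathbf{A}^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}. \tag{8.47}$$

We can confirm that $\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{A}^{-1} \cdot \mathbf{A} = \mathbf{I}$.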
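
The cofactor formula of Eqs. (8.44) and (8.45) translates directly into code. The sketch below is illustrative only: `minor`, `det`, `inverse` and `matmul` are hypothetical helpers, the 3 × 3 integer matrix is an assumption, and Laplace expansion is used for clarity even though it is far too slow for large matrices.

```python
from fractions import Fraction

def minor(A, row, col):
    """Matrix A with the given row and column removed."""
    return [[v for c, v in enumerate(r) if c != col]
            for k, r in enumerate(A) if k != row]

def det(A):
    """Laplace expansion along the first row; adequate for small matrices only."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** c * A[0][c] * det(minor(A, 0, c)) for c in range(len(A)))

def inverse(A):
    """Inverse of an integer matrix by cofactors, in exact rational arithmetic."""
    d = det(A)
    if d == 0:
        raise ValueError("singular matrix has no inverse")
    n = len(A)
    # element (i, j) of the inverse uses the minor with row j and column i removed
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i)), d)
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]   # illustrative matrix (assumed), det A = 7
A_inv = inverse(A)

print(A_inv)
print(matmul(A, A_inv))
```

Note the swapped indices in `minor(A, j, i)`: this is the transposition that the definition of the minor in the text builds into the formula.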



8.2.3.11 Similar matrices 

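Matrices A and B are similar if there exists a non-singular matrix S such that $\mathbf{B} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S}$. Similar matrices have the same determinant, eigenvalues, and multiplicities. Their eigenvectors are in general not identical but are related through S: if e is an eigenvector of A, then $\mathbf{S}^{-1} \cdot \mathbf{e}$ is an eigenvector of B with the same eigenvalue.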

8.2.4 Equations 

In general, for matrices that are not necessarily square, the equation $\mathbf{A}_{N \times M} \cdot \mathbf{x}_{M \times 1} = \mathbf{b}_{N \times 1}$ is solvable iff b can be expressed as combinations of the columns of A. Problems in which M < N are over-constrained; in special cases, those in which b is in the column space of A, a unique solution x exists. However in general no solution x exists; nevertheless, one can find an x which will minimize $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$. This is closely related to what is known as the method of least squares. Problems in which M > N are generally under-constrained, and have an infinite number of solutions x which will satisfy the original equation. Problems for which M = N (square matrices) have a unique solution x when the rank r of A is equal to N. If r < N, then the problem is under-constrained.

8.2.4.1 Over-constrained systems 



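Example 8.7

For $\mathbf{x} \in \mathbb{R}^2$, $\mathbf{b} \in \mathbb{R}^3$, $\mathbf{A} : \mathbb{R}^2 \to \mathbb{R}^3$, consider

$$\begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 3 \end{pmatrix}. \tag{8.48}$$

Here it turns out that $\mathbf{b} = (0, 1, 3)^T$ is not in the column space of A, and there is no solution x for which A · x = b! The column space is a plane defined by two vectors; the vector b does not happen to lie in the plane defined by the column space. However, we can find a solution $\mathbf{x} = \mathbf{x}_p$, where $\mathbf{x}_p$ can be shown to minimize the Euclidean norm of the residual $\|\mathbf{A} \cdot \mathbf{x}_p - \mathbf{b}\|_2$. This is achieved by the following procedure, the same employed earlier in Sec. 7.3.2.6, in which we operate on both vectors $\mathbf{A} \cdot \mathbf{x}_p$ and b by the operator $\mathbf{A}^T$ so as to map both vectors into the same space, namely the row space of A. Once the vectors are in the same space, a unique inversion is possible.

$$\mathbf{A} \cdot \mathbf{x}_p \simeq \mathbf{b}, \tag{8.49}$$

$$\mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{x}_p = \mathbf{A}^T \cdot \mathbf{b}, \tag{8.50}$$

$$\mathbf{x}_p = (\mathbf{A}^T \cdot \mathbf{A})^{-1} \cdot \mathbf{A}^T \cdot \mathbf{b}. \tag{8.51}$$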

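These operations are, numerically,

$$\begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \simeq \begin{pmatrix} 0 \\ 1 \\ 3 \end{pmatrix}, \tag{8.52}$$

$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 3 \end{pmatrix}, \tag{8.53}$$

$$\begin{pmatrix} 3 & 3 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 3 \end{pmatrix}, \tag{8.54}$$

so that $\mathbf{x}_p = (11/6, -1/2)^T$.

Note the resulting $\mathbf{x}_p$ will not satisfy $\mathbf{A} \cdot \mathbf{x}_p = \mathbf{b}$. We can define the difference of $\mathbf{A} \cdot \mathbf{x}_p$ and b as the residual vector, see Eq. (8.4), $\mathbf{r} = \mathbf{A} \cdot \mathbf{x}_p - \mathbf{b}$. In fact, $\|\mathbf{r}\|_2 = \|\mathbf{A} \cdot \mathbf{x}_p - \mathbf{b}\|_2 = 2.0412$. If we tried any nearby x, say $\mathbf{x} = (2, -3/5)^T$, $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2 = 2.0494 > 2.0412$. Since the problem is linear, this minimum is global; if we take $\mathbf{x} = (10, -24)^T$, then $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2 = 42.5911 > 2.0412$. Though we have not proved it, our $\mathbf{x}_p$ is the unique vector which minimizes the Euclidean norm of the residual.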
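
The normal-equation procedure of Eqs. (8.49) to (8.51) can be sketched as follows. The code assumes that A has rows (1, 2), (1, 0), (1, 1) and that b = (0, 1, 3)^T; this choice reproduces the residual norms 2.0412, 2.0494 and 42.5911 quoted in the example. The helper names are hypothetical, and Cramer's rule is used only because the normal equations here are 2 × 2.

```python
from fractions import Fraction
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve_2x2(M, rhs):
    """Cramer's rule for a 2 x 2 integer system M z = rhs."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("A^T A is singular")
    return [Fraction(rhs[0] * M[1][1] - M[0][1] * rhs[1], det),
            Fraction(M[0][0] * rhs[1] - rhs[0] * M[1][0], det)]

def residual_norm(A, x, b):
    """Euclidean norm of A x - b."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(matvec(A, x), b)))

A = [[1, 2], [1, 0], [1, 1]]     # assumed matrix of the example
b = [0, 1, 3]

At = transpose(A)
x_p = solve_2x2(matmul(At, A), matvec(At, b))   # solve A^T A x = A^T b

print(x_p)
print(round(residual_norm(A, x_p, b), 4))
print(round(residual_norm(A, [2, Fraction(-3, 5)], b), 4))
print(round(residual_norm(A, [10, -24], b), 4))
```

For larger or poorly conditioned problems, forming A^T · A explicitly is usually avoided in favor of orthogonal factorizations; the direct form is kept here because it mirrors the equations in the text.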

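Further manipulation shows that we can write our solution as a combination of vectors in the row space of A. As the dimension of the right null space of A is zero, there is no possible contribution from the right null space vectors.

$$\begin{pmatrix} \frac{11}{6} \\ -\frac{1}{2} \end{pmatrix} = \alpha_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \alpha_2 \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{8.55}$$

$$\begin{pmatrix} \frac{11}{6} \\ -\frac{1}{2} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}, \tag{8.56}$$

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} -\frac{1}{4} \\ \frac{25}{12} \end{pmatrix}. \tag{8.57}$$

So

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \underbrace{-\frac{1}{4} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \frac{25}{12} \begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{\text{linear combination of row space vectors}}. \tag{8.58}$$

We could also have chosen to expand in terms of the other row space vector $(1, 1)^T$, since any two of the three row space vectors span the space $\mathbb{R}^2$.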

The vector $\mathbf{A} \cdot \mathbf{x}_p$ actually represents the projection of b onto the subspace spanned by the column vectors (i.e. the column space). Call the projected vector $\mathbf{b}_p$:

$$\mathbf{b}_p = \mathbf{A} \cdot \mathbf{x}_p = \underbrace{\mathbf{A} \cdot (\mathbf{A}^T \cdot \mathbf{A})^{-1} \cdot \mathbf{A}^T}_{\text{projection matrix, } \mathbf{P}} \cdot \mathbf{b}. \tag{8.59}$$

For this example $\mathbf{b}_p = (5/6, 11/6, 4/3)^T$. We can think of $\mathbf{b}_p$ as the shadow cast by b onto the column space. Here, following Eq. (7.160), we have the projection matrix P as

$$\mathbf{P} = \mathbf{A} \cdot (\mathbf{A}^T \cdot \mathbf{A})^{-1} \cdot \mathbf{A}^T. \tag{8.60}$$

Figure 8.1: Plot for b which lies outside of column space (space spanned by $\mathbf{c}_1$ and $\mathbf{c}_2$) of A.

A sketch of this system is shown in Fig. 8.1. Here we sketch what might represent this example in which the column space of A does not span the entire space $\mathbb{R}^3$, and for which b lies outside of the column space of A. In such a case $\|\mathbf{A} \cdot \mathbf{x}_p - \mathbf{b}\|_2 > 0$. We have A as a matrix which maps two-dimensional vectors x into three-dimensional vectors b. Our space is $\mathbb{R}^3$, and embedded within that space are two column vectors $\mathbf{c}_1$ and $\mathbf{c}_2$ which span a column space $\mathbb{R}^2$, which is represented by a plane within a three-dimensional volume. Since b lies outside the column space, there exists no vector x for which A · x = b.
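
The projection matrix of Eq. (8.60) can be formed explicitly for this example. The sketch assumes the same A and b as in the least squares calculation, namely A with rows (1, 2), (1, 0), (1, 1) and b = (0, 1, 3)^T, which reproduces the quoted b_p; `transpose`, `matmul` and `inverse_2x2` are hypothetical helpers.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(M):
    """Exact inverse of a non-singular 2 x 2 integer matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[Fraction(M[1][1], det), Fraction(-M[0][1], det)],
            [Fraction(-M[1][0], det), Fraction(M[0][0], det)]]

A = [[1, 2], [1, 0], [1, 1]]     # assumed matrix of the example
b = [[0], [1], [3]]              # b as a 3 x 1 column

At = transpose(A)
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)   # P = A (A^T A)^{-1} A^T
b_p = matmul(P, b)                                       # projection of b
trace = sum(P[i][i] for i in range(3))

print(P)
print(b_p, trace, matmul(P, P) == P)
```

The checks tie this example back to Sec. 8.2.3.3: P is idempotent, and its trace equals its rank, which is the dimension of the column space.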

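Example 8.8

For $\mathbf{x} \in \mathbb{R}^2$, $\mathbf{b} \in \mathbb{R}^3$, consider $\mathbf{A} : \mathbb{R}^2 \to \mathbb{R}^3$,

$$\begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \\ 3 \end{pmatrix}. \tag{8.61}$$

The column space of A is spanned by the two column vectors

$$\mathbf{c}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad \mathbf{c}_2 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}. \tag{8.62}$$

Our equation can also be cast in the form which makes the contribution of the column vectors obvious:

$$x_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + x_2 \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \\ 3 \end{pmatrix}. \tag{8.63}$$

Here we have the unusual case that $\mathbf{b} = (5, 1, 3)^T$ is in the column space of A (in fact $\mathbf{b} = \mathbf{c}_1 + 2\mathbf{c}_2$), and we have a unique solution of

$$\mathbf{x} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \tag{8.64}$$

Figure 8.2: Plot for b which lies in column space (space spanned by $\mathbf{c}_1$ and $\mathbf{c}_2$) of A.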



In most cases, however, it is not obvious that b lies in the column space. We can still operate on both sides by the transpose and solve, which will reveal the correct result:

$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 5 \\ 1 \\ 3 \end{pmatrix}, \tag{8.65}$$

$$\begin{pmatrix} 3 & 3 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 9 \\ 13 \end{pmatrix}, \tag{8.66}$$

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \tag{8.67}$$

A quick check of the residual shows that in fact $\mathbf{r} = \mathbf{A} \cdot \mathbf{x}_p - \mathbf{b} = \mathbf{0}$. So, we have an exact solution for which $\mathbf{x} = \mathbf{x}_p$.

Note that the solution vector x lies entirely in the row space of A; here, it is identically the first row vector $\mathbf{r}_1 = (1, 2)^T$. Note also that here the column space is a two-dimensional subspace, in this case a plane defined by the two column vectors, embedded within a three-dimensional space. The operator A maps arbitrary two-dimensional vectors x into the three-dimensional b; however, these b vectors are confined to a two-dimensional subspace within the greater three-dimensional space. Consequently, we cannot always expect to find a vector x for arbitrary b!

A sketch of this system is shown in Fig. 8.2. Here we sketch what might represent this example in which the column space of A does not span the entire space $\mathbb{R}^3$, but for which b lies in the column space of A. In such a case $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2 = 0$. We have A as a matrix which maps two-dimensional vectors x into three-dimensional vectors b. Our space is $\mathbb{R}^3$ and embedded within that space are two column vectors $\mathbf{c}_1$ and $\mathbf{c}_2$ which span a column space $\mathbb{R}^2$, which is represented by a plane within a three-dimensional volume. Since b in this example happens to lie in the column space, there exists a unique vector x for which A · x = b.



8.2.4.2 Under-constrained systems 



I 

Example 8.9 

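Consider now $A : \mathbb{R}^3 \to \mathbb{R}^2$ such that

$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}. \tag{8.68}$$

In this case operating on both sides by the transpose is not useful because $(A^T \cdot A)^{-1}$ does not
exist. We take an alternate strategy.

Certainly $b = (1, 3)^T$ lies in the column space of A, since for example, $b = 0(1, 2)^T - 2(1, 0)^T + 3(1, 1)^T$.
Setting $x_1 = t$, where t is an arbitrary number, lets us solve for $x_2, x_3$:

$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \tag{8.69}$$

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 - t \\ 3 - 2t \end{pmatrix}. \tag{8.70}$$

Inversion gives

$$\begin{pmatrix} x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -2 + t \\ 3 - 2t \end{pmatrix}, \tag{8.71}$$

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ -2 \\ 3 \end{pmatrix} + \underbrace{t \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}}_{\text{right null space}}, \qquad t \in \mathbb{R}^1. \tag{8.72}$$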

A useful way to think of problems such as this which are undetermined is that the matrix A maps 
the additive combination of a unique vector from the row space of A plus an arbitrary vector from the 
right null space of A into the vector b. Here the vector (1, 1, — 2) T is in the right null space; however, 
the vector (0, — 2,3) T has components in both the right null space and the row space. Let us extract 
the parts of (0, —2, 3) T which are in each space. Since the row space and right null space are linearly 
independent, they form a basis, and we can say 



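$$\begin{pmatrix} 0 \\ -2 \\ 3 \end{pmatrix} = \underbrace{a_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + a_2 \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}}_{\text{row space}} + \underbrace{a_3 \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}}_{\text{right null space}}. \tag{8.73}$$

In matrix form, we then get

$$\begin{pmatrix} 0 \\ -2 \\ 3 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & -2 \end{pmatrix}}_{\text{invertible}} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}. \tag{8.74}$$

The coefficient matrix is non-singular and thus invertible. Solving, we get

$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} -2/3 \\ 1 \\ -4/3 \end{pmatrix}. \tag{8.75}$$

So x can be rewritten as

$$x = \underbrace{-\frac{2}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}}_{\text{row space}} + \underbrace{\left(t - \frac{4}{3}\right) \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}}_{\text{right null space}}, \qquad t \in \mathbb{R}^1. \tag{8.76}$$

The first two terms in the right-hand side of Eq. (8.76) are the unique linear combination of the row
space vectors, while the third term is from the right null space. As by definition, A maps any vector
from the right null space into the zero element, it makes no contribution to forming b; hence, one can
allow for an arbitrary constant. Note the analogy here with solutions to inhomogeneous differential
equations. The right null space vector can be thought of as a solution to the homogeneous equation,
and the terms with the row space vectors can be thought of as particular solutions.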

We can also write the solution x in matrix form. The matrix is composed of three column vectors, 
which are the original two row space vectors and the right null space vector, which together form a 
basis in R 3 : 



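$$x = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & -2 \end{pmatrix} \begin{pmatrix} -2/3 \\ 1 \\ t - 4/3 \end{pmatrix}, \qquad t \in \mathbb{R}^1. \tag{8.77}$$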

While the right null space vector is orthogonal to both row space vectors, the row space vectors are not 
orthogonal to themselves, so this basis is not orthogonal. Leaving out the calculational details, we can 
use the Gram-Schmidt procedure to cast the solution on an orthonormal basis: 



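$$x = \underbrace{\frac{1}{\sqrt{3}} \begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix} + \sqrt{2} \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix}}_{\text{row space}} + \underbrace{\sqrt{6}\left(t - \frac{4}{3}\right) \begin{pmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{pmatrix}}_{\text{right null space}}, \qquad t \in \mathbb{R}^1. \tag{8.78}$$

The first two terms are in the row space, now represented on an orthonormal basis, the third is in the
right null space. In matrix form, we can say that

$$x = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -2/\sqrt{6} \end{pmatrix} \begin{pmatrix} 1/\sqrt{3} \\ \sqrt{2} \\ \sqrt{6}\left(t - 4/3\right) \end{pmatrix}, \qquad t \in \mathbb{R}^1. \tag{8.79}$$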



Of course, there are other orthonormal bases on which the system can be cast. 

We see that the minimum length of the vector x occurs when t = 4/3, that is when x is entirely in 
the row space. In such a case we have 



$$\min ||x||_2 = \sqrt{\left(\frac{1}{\sqrt{3}}\right)^2 + \left(\sqrt{2}\right)^2} = \sqrt{\frac{7}{3}}. \tag{8.80}$$

Lastly note that here, we achieved a reasonable answer by setting $x_1 = t$ at the outset. We could
have achieved an equivalent result by starting with $x_2 = t$, or $x_3 = t$. This will not work in all problems,
as will be discussed in Sec. 8.8.3 on row echelon form.
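The minimum-norm result of this example can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the coefficient matrix implied by the column vectors quoted in the example, and it uses the standard closed form $x = A^T \cdot (A \cdot A^T)^{-1} \cdot b$ for a full-row-rank system rather than the parametric approach taken in the text. The function name `min_norm_solution` is hypothetical.

```python
from fractions import Fraction as F

# Coefficient matrix and right-hand side of the under-constrained example.
A = [[F(1), F(1), F(1)],
     [F(2), F(0), F(1)]]
b = [F(1), F(3)]


def min_norm_solution(A, b):
    """Return x = A^T (A A^T)^{-1} b for a 2 x N matrix A of rank 2."""
    # Entries of the symmetric 2 x 2 matrix G = A A^T.
    g11 = sum(p * p for p in A[0])
    g12 = sum(p * q for p, q in zip(A[0], A[1]))
    g22 = sum(q * q for q in A[1])
    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("rows of A are linearly dependent")
    # Solve G y = b with the explicit inverse of a 2 x 2 matrix.
    y1 = (g22 * b[0] - g12 * b[1]) / det
    y2 = (g11 * b[1] - g12 * b[0]) / det
    # x = A^T y is a combination of the rows of A, so it lies in the row space.
    return [y1 * p + y2 * q for p, q in zip(A[0], A[1])]


x_min = min_norm_solution(A, b)
norm_squared = sum(c * c for c in x_min)
print(x_min, norm_squared)
```

Exact rational arithmetic is used so that the result can be compared directly with the value t = 4/3 and the norm $\sqrt{7/3}$ quoted in the text.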

I 



8.2.4.3 Simultaneously over- and under-constrained systems 

Some systems of equations are both over- and under-constrained simultaneously. This often
happens when the rank r of the matrix is less than both N and M, the matrix dimensions.
Such matrices are known as less than full rank matrices.

Example 8.10 

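Consider $A : \mathbb{R}^4 \to \mathbb{R}^3$ such that

$$\begin{pmatrix} 1 & 2 & 0 & 4 \\ 3 & 2 & -1 & 3 \\ -1 & 2 & 1 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}. \tag{8.81}$$

Using elementary row operations to perform Gaussian elimination gives rise to the equivalent system:

$$\begin{pmatrix} 1 & 0 & -1/2 & -1/2 \\ 0 & 1 & 1/4 & 9/4 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{8.82}$$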



We immediately see that there is a problem in the last equation, which purports 0 = 1! What is actually
happening is that A is not full rank r = 3, but actually has r = 2, so vectors $x \in \mathbb{R}^4$ are mapped
into a two-dimensional subspace. So, we do not expect to find any solution to this problem, since our
vector b is an arbitrary three-dimensional vector which most likely does not lie in the two-dimensional
subspace. We can, however, find an x which minimizes the Euclidean norm of the residual. We return
to the original equation and operate on both sides with $A^T$ to form $A^T \cdot A \cdot x = A^T \cdot b$. It can be
easily verified that if we chose to operate on the system which was reduced by Gaussian elimination
that we would not recover a solution which minimized $||A \cdot x - b||_2$!










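$$\begin{pmatrix} 1 & 3 & -1 \\ 2 & 2 & 2 \\ 0 & -1 & 1 \\ 4 & 3 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 & 4 \\ 3 & 2 & -1 & 3 \\ -1 & 2 & 1 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1 & 3 & -1 \\ 2 & 2 & 2 \\ 0 & -1 & 1 \\ 4 & 3 & 5 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}, \tag{8.83}$$

$$\begin{pmatrix} 11 & 6 & -4 & 8 \\ 6 & 12 & 0 & 24 \\ -4 & 0 & 2 & 2 \\ 8 & 24 & 2 & 50 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 8 \\ 12 \\ -1 \\ 23 \end{pmatrix}. \tag{8.84}$$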



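This operation has mapped both sides of the equation into the same space, namely, the column space
of $A^T$, which is also the row space of A. Since the rank of A is r = 2, the dimension of the row space
is also two, and now the vectors on both sides of the equation have been mapped into the same plane.
Again using row operations to perform Gaussian elimination gives rise to

$$\begin{pmatrix} 1 & 0 & -1/2 & -1/2 \\ 0 & 1 & 1/4 & 9/4 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 7/8 \\ 0 \\ 0 \end{pmatrix}. \tag{8.85}$$

This equation suggests that here $x_3$ and $x_4$ are arbitrary, so we set $x_3 = s$, $x_4 = t$ and, treating s and
t as known quantities, reduce the system to the following

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1/4 + s/2 + t/2 \\ 7/8 - s/4 - 9t/4 \end{pmatrix}, \tag{8.86}$$

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 7/8 \\ 0 \\ 0 \end{pmatrix} + s \begin{pmatrix} 1/2 \\ -1/4 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} 1/2 \\ -9/4 \\ 0 \\ 1 \end{pmatrix}. \tag{8.87}$$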
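The vectors which are multiplied by s and t are in the right null space of A. The vector $(1/4, 7/8, 0, 0)^T$
is not entirely in the row space of A; it has components in both the row space and right null space. We
can, thus, decompose this vector into a linear combination of row space vectors and right null space
vectors using the procedure in the previous section, solving the following equation for the coefficients
$a_1, \ldots, a_4$, which are the coefficients of the row and right null space vectors:

$$\begin{pmatrix} 1/4 \\ 7/8 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 1/2 & 1/2 \\ 2 & 2 & -1/4 & -9/4 \\ 0 & -1 & 1 & 0 \\ 4 & 3 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}. \tag{8.88}$$

Solving, we get

$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} = \begin{pmatrix} -3/244 \\ 29/244 \\ 29/244 \\ -75/244 \end{pmatrix}. \tag{8.89}$$

So we can recast the solution as

$$x = \underbrace{-\frac{3}{244} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 4 \end{pmatrix} + \frac{29}{244} \begin{pmatrix} 3 \\ 2 \\ -1 \\ 3 \end{pmatrix}}_{\text{row space}} + \underbrace{\left(s + \frac{29}{244}\right) \begin{pmatrix} 1/2 \\ -1/4 \\ 1 \\ 0 \end{pmatrix} + \left(t - \frac{75}{244}\right) \begin{pmatrix} 1/2 \\ -9/4 \\ 0 \\ 1 \end{pmatrix}}_{\text{right null space}}. \tag{8.90}$$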



This choice of x guarantees that we minimize $||A \cdot x - b||_2$, which in this case is 1.22474. So there are
no vectors x which satisfy the original equation $A \cdot x = b$, but there are a doubly infinite number of
vectors x which can minimize the Euclidean norm of the residual.

We can choose special values of s and t such that we minimize $||x||_2$ while maintaining $||A \cdot x - b||_2$
at its global minimum. This is done simply by forcing the magnitude of the right null space vectors to
zero, so we choose s = -29/244, t = 75/244, giving

$$x = \underbrace{-\frac{3}{244} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 4 \end{pmatrix} + \frac{29}{244} \begin{pmatrix} 3 \\ 2 \\ -1 \\ 3 \end{pmatrix}}_{\text{row space}} = \begin{pmatrix} 21/61 \\ 13/61 \\ -29/244 \\ 75/244 \end{pmatrix}. \tag{8.91}$$

This vector has $||x||_2 = 0.522055$.



8.2.4.4 Square systems 

A set of N linear algebraic equations in N unknowns can be represented as

$$A \cdot x = b. \tag{8.92}$$

There is a unique solution if $\det A \ne 0$ and either no solution or an infinite number of
solutions otherwise. In the case where there are no solutions, one can still find an x which
minimizes the normed residual $||A \cdot x - b||_2$.

Theorem

(Cramer's rule) The solution of the equation is

$$x_i = \frac{\det A_i}{\det A}, \tag{8.93}$$

where $A_i$ is the matrix obtained by replacing the i-th column of A by b. While generally
valid, Cramer's rule is most useful for low dimension systems. For large systems, Gaussian
elimination is a more efficient technique.



I 

Example 8.11 

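For $A : \mathbb{R}^2 \to \mathbb{R}^2$, solve for x in $A \cdot x = b$:

$$\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \end{pmatrix}. \tag{8.94}$$

By Cramer's rule

$$x_1 = \frac{\begin{vmatrix} 4 & 2 \\ 5 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix}} = \frac{-2}{-4} = \frac{1}{2}, \tag{8.95}$$

$$x_2 = \frac{\begin{vmatrix} 1 & 4 \\ 3 & 5 \end{vmatrix}}{\begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix}} = \frac{-7}{-4} = \frac{7}{4}. \tag{8.96}$$

So

$$x = \begin{pmatrix} 1/2 \\ 7/4 \end{pmatrix}. \tag{8.97}$$

We get the same result by Gaussian elimination. Subtracting three times the first row from the second
yields

$$\begin{pmatrix} 1 & 2 \\ 0 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 4 \\ -7 \end{pmatrix}. \tag{8.98}$$

Thus, $x_2 = 7/4$. Back substitution into the first equation then gives $x_1 = 1/2$.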
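Cramer's rule for this 2 x 2 case is short enough to write out directly. The sketch below is illustrative only; the helper names `det2` and `cramer_2x2` are hypothetical, and exact fractions are used so the output can be compared with the hand calculation.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2 x 2 matrix stored as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def cramer_2x2(A, b):
    """Solve A x = b by Cramer's rule; A must be non-singular."""
    d = det2(A)
    if d == 0:
        raise ValueError("det A = 0: no unique solution")
    solution = []
    for i in range(2):
        # A_i is A with its i-th column replaced by b.
        Ai = [row[:] for row in A]
        for r in range(2):
            Ai[r][i] = b[r]
        solution.append(Fraction(det2(Ai), d))
    return solution


x = cramer_2x2([[1, 2], [3, 2]], [4, 5])
print(x)
```

The cost of evaluating determinants grows rapidly with N, which is why the text recommends Gaussian elimination for large systems.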



I 

Example 8.12 

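With $A : \mathbb{R}^2 \to \mathbb{R}^2$, find the most general x which best satisfies $A \cdot x = b$ for

$$\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}. \tag{8.99}$$

Obviously, there is no unique solution to this system since the determinant of the coefficient matrix
is zero. The rank of A is 1, so in actuality, A maps vectors from $\mathbb{R}^2$ into a one-dimensional subspace,
$\mathbb{R}^1$. For a general b, which does not lie in the one-dimensional subspace, we can find the best solution
x by first multiplying both sides by $A^T$:

$$\begin{pmatrix} 1 & 3 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \tag{8.100}$$

$$\begin{pmatrix} 10 & 20 \\ 20 & 40 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}. \tag{8.101}$$

This operation maps both sides of the equation into the column space of $A^T$, which is the row space
of A, which has dimension 1. Since the vectors are now in the same space, a solution can be found.
Using row reductions to perform Gaussian elimination, we get

$$\begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1/5 \\ 0 \end{pmatrix}. \tag{8.102}$$

We set $x_2 = t$, where t is any arbitrary real number and solve to get

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1/5 \\ 0 \end{pmatrix} + t \begin{pmatrix} -2 \\ 1 \end{pmatrix}. \tag{8.103}$$

The vector which t multiplies, $(-2, 1)^T$, is in the right null space of A. We can recast the vector
$(1/5, 0)^T$ in terms of a linear combination of the row space vector $(1, 2)^T$ and the right null space vector
to get the final form of the solution:

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \underbrace{\frac{1}{25} \begin{pmatrix} 1 \\ 2 \end{pmatrix}}_{\text{row space}} + \underbrace{\left(t - \frac{2}{25}\right) \begin{pmatrix} -2 \\ 1 \end{pmatrix}}_{\text{right null space}}. \tag{8.104}$$

This choice of x guarantees that the Euclidean norm of the residual $||A \cdot x - b||_2$ is minimized. In this
case the Euclidean norm of the residual is 1.89737. The vector x with the smallest norm that minimizes
$||A \cdot x - b||_2$ is found by setting the magnitude of the right null space contribution to zero, so we can
take t = 2/25 giving

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{1}{25} \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \tag{8.105}$$

This gives rise to $||x||_2 = 0.0894427$.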
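The claims of this example can be checked with exact arithmetic. The sketch below is an illustration, not the authors' code: it assumes the right-hand side $b = (2, 0)^T$ implied by $A^T \cdot b = (2, 4)^T$ and the quoted residual norm, and the helper names are hypothetical. It confirms that every member of the one-parameter family satisfies the normal equations with the same residual, and that t = 2/25 removes the right null space component.

```python
from fractions import Fraction as F

A = [[F(1), F(2)],
     [F(3), F(6)]]
b = [F(2), F(0)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[dot(row, col) for col in zip(*Y)] for row in X]


def least_squares_family(t):
    """Solution family x = (1/5, 0)^T + t (-2, 1)^T quoted in the text."""
    return [F(1, 5) - 2 * t, t]


def residual_norm_squared(x):
    r = [p - q for p, q in zip(matvec(A, x), b)]
    return dot(r, r)


At = transpose(A)
AtA = matmul(At, A)   # left side of the normal equations
Atb = matvec(At, b)   # right side of the normal equations

x_min = least_squares_family(F(2, 25))
print(AtA, Atb, x_min, residual_norm_squared(x_min))
```

The squared residual is 18/5 for every t, whose square root is the value 1.89737 quoted above.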



8.3 Eigenvalues and eigenvectors 

8.3.1 Ordinary eigenvalues and eigenvectors 

Much of the general discussion of eigenvectors and eigenvalues has been covered in Chap. 7,
see especially Sec. 7.4.4, and will not be repeated here. A few new concepts are introduced,
and some old ones reinforced.

First, we recall that when one refers to eigenvectors, one typically is referring to the right
eigenvectors which arise from $A \cdot e = \lambda I \cdot e$; if no distinction is made, it can be assumed
that it is the right set that is being discussed. Though it does not arise as often, there are
occasions when one requires the left eigenvectors which arise from $e^T \cdot A = e^T \cdot I \lambda$. Some
important properties and definitions involving eigenvalues are listed next:

• If the matrix A is self-adjoint, it can be shown that it has the same left and right
eigenvectors.

• If A is not self-adjoint, it has different left and right eigenvectors. The eigenvalues are
the same for both left and right eigenvectors of the same operator, whether or not the
system is self-adjoint.

• The polynomial equation that arises in the eigenvalue problem is the characteristic
equation of the matrix.

• The Cayley-Hamilton² theorem states that a matrix satisfies its own characteristic
equation.

• If a matrix is triangular, then its eigenvalues are its diagonal terms.

• Eigenvalues of $A \cdot A = A^2$ are the square of the eigenvalues of A.

• Every eigenvector of A is also an eigenvector of $A^2$.

• A matrix A has spectral radius, $\rho(A)$, defined as the largest of the absolute values of
its eigenvalues:

$$\rho(A) = \max_n (|\lambda_n|). \tag{8.106}$$

• Recall from Eq. (7.301) that a matrix A has a spectral norm, $||A||_2$, where

$$||A||_2 = \sqrt{\max_i (\kappa_i)}, \tag{8.107}$$

where for real valued A, $\kappa_i$ is an eigenvalue of $A^T \cdot A$. Note in general $\rho(A) \ne ||A||_2$.

• If A is self-adjoint, $\rho(A) = ||A||_2$.

• In general, Gelfand's³ formula holds

$$\rho(A) = \lim_{k \to \infty} ||A^k||^{1/k}. \tag{8.108}$$

The norm here holds for any matrix norm, including our spectral norm.

• The trace of a matrix is the sum of the terms on the leading diagonal.

• The trace of a N × N matrix is the sum of its N eigenvalues.

• The product of the N eigenvalues is the determinant of the matrix.

2 after Arthur Cayley, 1821-1895, English mathematician, and William Rowan Hamilton, 1805-1865,
Anglo-Irish mathematician.

3 Israel Gelfand, 1913-2009, Soviet mathematician.



I 

Example 8.13 

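Demonstrate the theorems and definitions just described for

$$A = \begin{pmatrix} 0 & 1 & -2 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix}. \tag{8.109}$$

The characteristic equation is

$$\lambda^3 - 6\lambda^2 + 11\lambda - 6 = 0. \tag{8.110}$$

The Cayley-Hamilton theorem is easily verified by direct substitution:

$$A^3 - 6A^2 + 11A - 6I = 0, \tag{8.111}$$

$$\begin{pmatrix} -30 & 19 & -38 \\ -10 & 13 & -24 \\ 52 & -26 & 53 \end{pmatrix} + \begin{pmatrix} 36 & -30 & 60 \\ -12 & -18 & 24 \\ -96 & 48 & -102 \end{pmatrix} + \begin{pmatrix} 0 & 11 & -22 \\ 22 & 11 & 0 \\ 44 & -22 & 55 \end{pmatrix} + \begin{pmatrix} -6 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & -6 \end{pmatrix} \tag{8.112}$$

$$= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{8.113}$$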
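The substitution above is easy to repeat by machine. The following sketch is an illustration under the assumption that A is the matrix recovered from the 11A term of Eq. (8.112); the helper names are hypothetical. It uses integer arithmetic, so the Cayley-Hamilton residual is exactly zero rather than zero to round-off, and it also evaluates the trace and determinant used later in the example.

```python
A = [[0, 1, -2],
     [2, 1, 0],
     [4, -2, 5]]
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]


def matmul(X, Y):
    """Product of two 3 x 3 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(M):
    """Determinant by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


A2 = matmul(A, A)
A3 = matmul(A2, A)

# Substitute A into its own characteristic polynomial.
residual = [[A3[i][j] - 6 * A2[i][j] + 11 * A[i][j] - 6 * I[i][j]
             for j in range(3)] for i in range(3)]

trace = sum(A[i][i] for i in range(3))
determinant = det3(A)
print(residual, trace, determinant)
```

The trace and determinant both equal 6, matching the sum and product of the eigenvalues 1, 2, 3.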



Considering the traditional right eigenvalue problem, $A \cdot e = \lambda I \cdot e$, it is easily shown that the eigenvalues
and (right) eigenvectors for this system are

$$\lambda_1 = 1, \qquad e_1 = \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}, \tag{8.114}$$

$$\lambda_2 = 2, \qquad e_2 = \begin{pmatrix} 1/2 \\ 1 \\ 0 \end{pmatrix}, \tag{8.115}$$

$$\lambda_3 = 3, \qquad e_3 = \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}. \tag{8.116}$$

One notes that while the eigenvectors do form a basis in $\mathbb{R}^3$, they are not orthogonal; this is a
consequence of the matrix not being self-adjoint (or more specifically asymmetric). The spectral radius
is $\rho(A) = 3$. Now

$$A^2 = A \cdot A = \begin{pmatrix} 0 & 1 & -2 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix} \begin{pmatrix} 0 & 1 & -2 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix} = \begin{pmatrix} -6 & 5 & -10 \\ 2 & 3 & -4 \\ 16 & -8 & 17 \end{pmatrix}. \tag{8.117}$$

It is easily shown that the eigenvalues for $A^2$ are 1, 4, 9, precisely the squares of the eigenvalues of A.
The trace is

$$\mathrm{tr}(A) = 0 + 1 + 5 = 6. \tag{8.118}$$

Note this is equal to the sum of the eigenvalues

$$\sum_{i=1}^{3} \lambda_i = 1 + 2 + 3 = 6. \tag{8.119}$$

Note also that

$$\det A = 6 = \lambda_1 \lambda_2 \lambda_3 = (1)(2)(3) = 6. \tag{8.120}$$

Note that since all the eigenvalues are positive, A is a positive matrix. It is not positive definite. Note
for instance if $x = (-1, 1, 1)^T$, that $x^T \cdot A \cdot x = -1$. We might ask about the positive definiteness of
the symmetric part of A, $A_s = (A + A^T)/2$:

$$A_s = \begin{pmatrix} 0 & 3/2 & 1 \\ 3/2 & 1 & -1 \\ 1 & -1 & 5 \end{pmatrix}. \tag{8.121}$$

In this case $A_s$ has real eigenvalues, both positive and negative, $\lambda_1 = 5.32$, $\lambda_2 = -1.39$, $\lambda_3 = 2.07$.
Because of the presence of a negative eigenvalue in the symmetric part of A, we can conclude that both
A and $A_s$ are not positive definite.

We also note that for real-valued problems $x \in \mathbb{R}^N$, $A \in \mathbb{R}^{N \times N}$, the antisymmetric part of a matrix
can never be positive definite by the following argument. We can say $x^T \cdot A \cdot x = x^T \cdot (A_s + A_a) \cdot x$.
Then one has $x^T \cdot A_a \cdot x = 0$ for all x because the tensor inner product of the real antisymmetric $A_a$
with the symmetric $x^T$ and x is identically zero. So to test the positive definiteness of a real A, it
suffices to consider the positive definiteness of its symmetric part: $x^T \cdot A_s \cdot x > 0$.

For complex-valued problems, $x \in \mathbb{C}^N$, $A \in \mathbb{C}^{N \times N}$, it is not quite as simple. Recalling that the
eigenvalues of an antisymmetric matrix $A_a$ are purely imaginary, we have, if x is an eigenvector of $A_a$,
that $\overline{x}^T \cdot A_a \cdot x = \overline{x}^T \cdot (\lambda) x = \overline{x}^T \cdot (i\lambda_I) x = i\lambda_I \overline{x}^T \cdot x = i\lambda_I ||x||_2^2$, where $\lambda_I \in \mathbb{R}^1$. Hence whenever the
vector x is an eigenvector of $A_a$, the quantity $\overline{x}^T \cdot A_a \cdot x$ is a pure imaginary number.

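We can also easily solve the left eigenvalue problem, $e_L^T \cdot A = \lambda e_L^T \cdot I$:

$$\lambda_1 = 1, \qquad e_{L1} = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}, \tag{8.122}$$

$$\lambda_2 = 2, \qquad e_{L2} = \begin{pmatrix} -3 \\ 1 \\ -2 \end{pmatrix}, \tag{8.123}$$

$$\lambda_3 = 3, \qquad e_{L3} = \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}. \tag{8.124}$$

We see eigenvalues are the same, but the left and right eigenvectors are different.

We find $||A||_2$ by considering eigenvalues of $A^T \cdot A$, the real variable version of that described in
Eq. (7.301):

$$A^T \cdot A = \begin{pmatrix} 0 & 2 & 4 \\ 1 & 1 & -2 \\ -2 & 0 & 5 \end{pmatrix} \begin{pmatrix} 0 & 1 & -2 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix}, \tag{8.125}$$

$$= \begin{pmatrix} 20 & -6 & 20 \\ -6 & 6 & -12 \\ 20 & -12 & 29 \end{pmatrix}. \tag{8.126}$$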



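This matrix has eigenvalues $\kappa = 49.017, 5.858, 0.125$. The spectral norm is the square root of the largest,
giving

$$||A||_2 = \sqrt{49.017} = 7.00122. \tag{8.127}$$

The eigenvector of $A^T \cdot A$ corresponding to $\kappa = 49.017$ is $e_1 = (0.5829, -0.2927, 0.7579)^T$. When we
compute the quantity associated with the norm of an operator, we find this vector maps to the norm:

$$\frac{||A \cdot e_1||_2}{||e_1||_2} = \frac{\left|\left| \begin{pmatrix} 0 & 1 & -2 \\ 2 & 1 & 0 \\ 4 & -2 & 5 \end{pmatrix} \begin{pmatrix} 0.5829 \\ -0.2927 \\ 0.7579 \end{pmatrix} \right|\right|_2}{\left|\left| \begin{pmatrix} 0.5829 \\ -0.2927 \\ 0.7579 \end{pmatrix} \right|\right|_2} = \frac{\left|\left| \begin{pmatrix} -1.80863 \\ 0.873144 \\ 6.70698 \end{pmatrix} \right|\right|_2}{\left|\left| \begin{pmatrix} 0.582944 \\ -0.292744 \\ 0.757943 \end{pmatrix} \right|\right|_2} = 7.00122. \tag{8.128}$$

Had we chosen the eigenvector associated with the eigenvalue of largest magnitude, $e_3 = (-1, -1, 1)^T$,
we would have found $||A \cdot e_3||_2 / ||e_3||_2 = 3$, the spectral radius. Obviously, this is not the maximum of
this operation and thus cannot be a norm.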



We can easily verify Gelfand's theorem by direct calculation of $||A^k||_2^{1/k}$ for various k. We find the
following.

    k        ||A^k||_2^{1/k}
    1        7.00122
    2        5.27011
    3        4.61257
    4        4.26334
    5        4.03796
    10       3.52993
    100      3.04984
    1000     3.00494
    ∞        3

As $k \to \infty$, $||A^k||_2^{1/k}$ approaches the spectral radius $\rho(A) = 3$.
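Because Gelfand's formula holds for any matrix norm, the limit can be illustrated without an eigenvalue solver. The sketch below is an assumption-laden illustration, not the calculation used for the table: it swaps the spectral norm for the Frobenius norm (square root of the sum of squared entries), so its values for small k differ from the tabulated ones, while the limit is the same. Exact integer powers are used, and the logarithm of the large integer is taken to avoid floating-point overflow. The function name `gelfand_estimate` is hypothetical.

```python
import math

A = [[0, 1, -2],
     [2, 1, 0],
     [4, -2, 5]]


def matmul(X, Y):
    """Product of two 3 x 3 integer matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def gelfand_estimate(A, k):
    """Return ||A^k||_F^(1/k), using the Frobenius norm and exact powers."""
    P = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity matrix
    for _ in range(k):
        P = matmul(P, A)
    sum_sq = sum(e * e for row in P for e in row)
    # ||A^k||_F = sqrt(sum_sq); take the log so huge integers do not overflow.
    return math.exp(0.5 * math.log(sum_sq) / k)


for k in (1, 10, 100, 1000):
    print(k, gelfand_estimate(A, k))
```

As k grows the estimate decreases toward 3, the spectral radius, in the same manner as the spectral-norm values in the table.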



8.3.2 Generalized eigenvalues and eigenvectors in the second sense 



On p. 288, we studied generalized eigenvectors in the first sense. Here we consider a distinct
problem which leads to generalized eigenvalues and eigenvectors in the second sense.
Consider the problem

$$A \cdot e = \lambda B \cdot e, \tag{8.129}$$

where A and B are square matrices, possibly singular, of dimension N × N, e is a generalized
eigenvector in the second sense, and $\lambda$ is a generalized eigenvalue. If B were not singular,
we could form $(B^{-1} \cdot A) \cdot e = \lambda I \cdot e$, which amounts to an ordinary eigenvalue problem. But
let us assume that the inverses do not exist. Then Eq. (8.129) can be re-cast as

$$(A - \lambda B) \cdot e = 0. \tag{8.130}$$

For non-trivial solutions, we simply require

$$\det(A - \lambda B) = 0, \tag{8.131}$$

and analyze in a similar manner.



I 

Example 8.14 

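Find the generalized eigenvalues and eigenvectors in the second sense for

$$\underbrace{\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}}_{A} e = \lambda \underbrace{\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}}_{B} e. \tag{8.132}$$

Here B is obviously singular. We rewrite as

$$\begin{pmatrix} 1 - \lambda & 2 \\ 2 - \lambda & 1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{8.133}$$

For a non-trivial solution, we require

$$\begin{vmatrix} 1 - \lambda & 2 \\ 2 - \lambda & 1 \end{vmatrix} = 0, \tag{8.134}$$

which gives a generalized eigenvalue of

$$1 - \lambda - 2(2 - \lambda) = 0, \tag{8.135}$$

$$1 - \lambda - 4 + 2\lambda = 0, \tag{8.136}$$

$$\lambda = 3. \tag{8.137}$$

For e, we require

$$\begin{pmatrix} 1 - 3 & 2 \\ 2 - 3 & 1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{8.138}$$

$$\begin{pmatrix} -2 & 2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{8.139}$$

By inspection, the generalized eigenvector in the second sense

$$e = \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \tag{8.140}$$

satisfies Eq. (8.132) when $\lambda = 3$, and $\alpha$ is any scalar.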
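For 2 x 2 matrices the condition $\det(A - \lambda B) = 0$ is at most a quadratic in $\lambda$, so the procedure of this example can be sketched in a few lines. This is an illustration only; the function name is hypothetical, and the degenerate case in which the determinant vanishes identically (or never) is simply reported as an empty list rather than analyzed.

```python
import cmath


def generalized_eigenvalues_2x2(A, B):
    """Roots of det(A - lam*B) = c2*lam**2 + c1*lam + c0 for 2 x 2 A, B."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    c2 = b11 * b22 - b12 * b21                          # det B
    c1 = a12 * b21 + a21 * b12 - a11 * b22 - a22 * b11
    c0 = a11 * a22 - a12 * a21                          # det A
    if c2 == 0:
        # Singular B: the polynomial drops to first degree (or lower).
        if c1 == 0:
            return []  # no isolated finite eigenvalue; not analyzed here
        return [-c0 / c1]
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]


A = [[1, 2], [2, 1]]
B = [[1, 0], [1, 0]]
eigenvalues = generalized_eigenvalues_2x2(A, B)

# Check that e = (1, 1)^T satisfies (A - lam*B) e = 0 for the root found.
lam = eigenvalues[0]
residual = [(A[i][0] - lam * B[i][0]) * 1 + (A[i][1] - lam * B[i][1]) * 1
            for i in range(2)]
print(eigenvalues, residual)
```

With a singular B only one finite generalized eigenvalue appears, consistent with the single value found by hand.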
[Figure 8.3 panels: original unit square; representation in linearly mapped coordinates.]

Figure 8.3: Unit square transforming via stretching and rotation under a linear area- and
orientation-preserving alibi mapping.

8.4 Matrices as linear mappings 

By considering a matrix as an operator which effects a linear mapping and applying it 
to a specific geometry, one can better envision the characteristics of the matrix. This is 
demonstrated in the following example. 



I 

Example 8.15 

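Consider how the matrix

$$A = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}, \tag{8.141}$$

acts on vectors x, including those that form a unit square with vertices as A : (0, 0), B : (1, 0), C :
(1, 1), D : (0, 1).

The original square has area of A = 1. Each of the vertices map under the linear homogeneous
transformation to

$$\underbrace{\begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}}_{A}, \qquad \underbrace{\begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}}_{B}, \tag{8.142}$$

$$\underbrace{\begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \end{pmatrix}}_{C}, \qquad \underbrace{\begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}}_{D}. \tag{8.143}$$

In the mapped space, the square has transformed to a parallelogram. This is plotted in Fig. 8.3. Here,
the alibi approach to the mapping is clearly evident. We keep the coordinate axes fixed in Fig. 8.3 and
rotate and stretch the vectors, instead of keeping the vectors fixed and rotating the axes, as would have
been done in an alias transformation. Now we have

$$\det A = (0)(-1) - (1)(-1) = 1. \tag{8.144}$$

Thus, the mapping is both orientation- and area-preserving. The orientation-preserving feature is
obvious by inspecting the locations of the points A, B, C, and D in both configurations shown in
Fig. 8.3. We easily calculate the area in the mapped space by combining the areas of two triangles
which form the parallelogram:

$$A = \frac{1}{2}(1)(1) + \frac{1}{2}(1)(1) = 1. \tag{8.145}$$

The eigenvalues of A are $-(1/2) \pm (\sqrt{3}/2) i$, both of which have magnitude of unity. Thus, the spectral
radius $\rho(A) = 1$. However, the spectral norm of A is non-unity, because

$$A^T \cdot A = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, \tag{8.146}$$

which has eigenvalues

$$\kappa = \frac{1}{2}\left(3 \pm \sqrt{5}\right). \tag{8.147}$$

The spectral norm is the square root of the maximum eigenvalue of $A^T \cdot A$, which is

$$||A||_2 = \sqrt{\frac{1}{2}\left(3 + \sqrt{5}\right)} = 1.61803. \tag{8.148}$$

It will later be shown, Sec. 8.8.4, that the action of A on the unit square can be decomposed into a
deformation and a rotation. Both are evident in Fig. 8.3.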
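The vertex mapping and the area calculation of this example can be reproduced with a few lines of code. The sketch below is illustrative; it assumes the matrix whose determinant is expanded in Eq. (8.144), and it computes the signed area with the shoelace formula instead of the two-triangle decomposition used in the text. A positive signed area for counterclockwise-ordered vertices indicates that orientation is preserved. The helper names are hypothetical.

```python
A = [[0, -1],
     [1, -1]]

# Unit square vertices A, B, C, D listed counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]


def apply(M, p):
    """Map the point p = (x, y) with the 2 x 2 matrix M."""
    return (M[0][0] * p[0] + M[0][1] * p[1],
            M[1][0] * p[0] + M[1][1] * p[1])


def signed_area(polygon):
    """Shoelace formula; positive for counterclockwise vertex order."""
    total = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2


mapped = [apply(A, p) for p in square]
print(mapped, signed_area(square), signed_area(mapped))
```

Both signed areas equal 1, consistent with det A = 1.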

I 



8.5 Complex matrices 



If x and y are complex vectors, we know that their inner product involves the conjugate
transpose. The conjugate transpose operation occurs so often we give it a name, the Hermitian
transpose, and denote it by a superscript H. Thus, we define the inner product
as

$$<x, y> = \overline{x}^T \cdot y = x^H \cdot y. \tag{8.149}$$

Then the norm is given by

$$||x||_2 = +\sqrt{x^H \cdot x}. \tag{8.150}$$



I 

Example 8.16 

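If

$$x = \begin{pmatrix} 1 + i \\ 3 - 2i \\ 2 \\ -3i \end{pmatrix}, \tag{8.151}$$

find $||x||_2$.

$$||x||_2 = +\sqrt{x^H \cdot x} = +\sqrt{(1 - i, \; 3 + 2i, \; 2, \; +3i) \begin{pmatrix} 1 + i \\ 3 - 2i \\ 2 \\ -3i \end{pmatrix}} = +\sqrt{2 + 13 + 4 + 9} = 2\sqrt{7}. \tag{8.152}$$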



I 

Example 8.17 
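If

$$x = \begin{pmatrix} 1 + i \\ -2 + 3i \\ 2 - i \end{pmatrix}, \tag{8.153}$$

$$y = \begin{pmatrix} 3 \\ 4 - 2i \\ 3 + 3i \end{pmatrix}, \tag{8.154}$$

find $<x, y>$.

$$<x, y> = x^H \cdot y, \tag{8.155}$$

$$= (1 - i, \; -2 - 3i, \; 2 + i) \begin{pmatrix} 3 \\ 4 - 2i \\ 3 + 3i \end{pmatrix}, \tag{8.156}$$

$$= (3 - 3i) + (-14 - 8i) + (3 + 9i), \tag{8.157}$$

$$= -8 - 2i. \tag{8.158}$$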
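The two preceding examples translate directly into code using Python's built-in complex type. The sketch below is illustrative; the helper names `inner` and `norm2` are hypothetical. Note that the conjugate is applied to the first argument, matching the definition $<x, y> = x^H \cdot y$.

```python
def inner(x, y):
    """Complex inner product <x, y> = x^H . y (conjugate the first vector)."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def norm2(x):
    """Euclidean norm ||x||_2 = +sqrt(x^H . x)."""
    return abs(inner(x, x)) ** 0.5


# Vector of the first example.
x_a = [1 + 1j, 3 - 2j, 2 + 0j, -3j]

# Vectors of the second example.
x_b = [1 + 1j, -2 + 3j, 2 - 1j]
y_b = [3 + 0j, 4 - 2j, 3 + 3j]

print(norm2(x_a), inner(x_b, y_b))
```

Swapping the arguments of `inner` yields the complex conjugate of the result, so the order matters for complex vectors.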



Likewise, the conjugate or Hermitian transpose of a matrix A is $A^H$, given by the transpose
of the matrix with each element being replaced by its conjugate:

$$A^H = \overline{A}^T. \tag{8.159}$$

As the Hermitian transpose is the adjoint operator corresponding to a given complex matrix,
we can apply an earlier proved theorem for linear operators, Sec. 7.4.4, to deduce that the
eigenvalues of a complex matrix are the complex conjugates of the eigenvalues of the Hermitian
transpose of that matrix.

The Hermitian transpose is distinguished from a matrix which is Hermitian as follows. A 
Hermitian matrix is one which is equal to its conjugate transpose. So a matrix which equals 
its Hermitian transpose is Hermitian. A matrix which does not equal its Hermitian transpose 
is non-Hermitian. A skew-Hermitian matrix is the negative of its Hermitian transpose. A 
Hermitian matrix is self-adjoint. 

Properties: 
• $x^H \cdot A \cdot x$ is real if A is Hermitian.

• The eigenvalues of a Hermitian matrix are real.

• The eigenvectors of a Hermitian matrix that correspond to different eigenvalues are
orthogonal to each other.

• The determinant of a Hermitian matrix is real.

• The spectral radius of a Hermitian matrix is equal to its spectral norm, $\rho(A) = ||A||_2$.

• If A is skew-Hermitian, then iA is Hermitian, and vice-versa.

Note the diagonal elements of a Hermitian matrix must be real as they must be unchanged 
by conjugation. 



I 

Example 8.18 

Consider $A \cdot x = b$, where $A : \mathbb{C}^3 \to \mathbb{C}^3$ with A the Hermitian matrix and x the complex vector:

$$A = \begin{pmatrix} 1 & 2 - i & 3 \\ 2 + i & -3 & 2i \\ 3 & -2i & 4 \end{pmatrix}, \qquad x = \begin{pmatrix} 3 + 2i \\ -1 \\ 2 - i \end{pmatrix}. \tag{8.160}$$

First, we have

$$b = A \cdot x = \begin{pmatrix} 1 & 2 - i & 3 \\ 2 + i & -3 & 2i \\ 3 & -2i & 4 \end{pmatrix} \begin{pmatrix} 3 + 2i \\ -1 \\ 2 - i \end{pmatrix} = \begin{pmatrix} 7 \\ 9 + 11i \\ 17 + 4i \end{pmatrix}. \tag{8.161}$$

Now, demonstrate that the properties of Hermitian matrices hold for this case. First

$$x^H \cdot A \cdot x = (3 - 2i, \; -1, \; 2 + i) \begin{pmatrix} 1 & 2 - i & 3 \\ 2 + i & -3 & 2i \\ 3 & -2i & 4 \end{pmatrix} \begin{pmatrix} 3 + 2i \\ -1 \\ 2 - i \end{pmatrix} = 42 \in \mathbb{R}^1. \tag{8.162}$$

The eigenvalues and (right, same as left here) eigenvectors are

$$\lambda_1 = 6.51907, \qquad e_1 = \begin{pmatrix} 0.525248 \\ 0.132451 + 0.223964i \\ 0.803339 - 0.105159i \end{pmatrix}, \tag{8.163}$$

$$\lambda_2 = -0.104237, \qquad e_2 = \begin{pmatrix} -0.745909 \\ -0.385446 + 0.0890195i \\ 0.501844 - 0.187828i \end{pmatrix}, \tag{8.164}$$

$$\lambda_3 = -4.41484, \qquad e_3 = \begin{pmatrix} 0.409554 \\ -0.871868 - 0.125103i \\ -0.116278 - 0.207222i \end{pmatrix}. \tag{8.165}$$

By inspection $\rho(A) = 6.51907$. Because A is Hermitian, we also have $||A||_2 = \rho(A) = 6.51907$. We find
this by first finding the eigenvalues of $A^H \cdot A$, which are 42.4983, 19.4908, and 0.010865. The square
roots of these are 6.51907, 4.41484, and 0.104237; the spectral norm is the maximum, 6.51907.

Check for orthogonality between two of the eigenvectors, e.g. $e_1, e_2$:

$$<e_1, e_2> = e_1^H \cdot e_2, \tag{8.166}$$

$$= (0.525248, \; 0.132451 - 0.223964i, \; 0.803339 + 0.105159i) \cdot \begin{pmatrix} -0.745909 \\ -0.385446 + 0.0890195i \\ 0.501844 - 0.187828i \end{pmatrix}, \tag{8.167}$$

$$= 0 + 0i. \tag{8.168}$$

The same holds for other eigenvectors. It can then be shown that

$$\det A = 3, \tag{8.169}$$

which is also equal to the product of the eigenvalues. This also tells us that A is not volume-preserving,
but it is orientation-preserving.
Lastly,

$$iA = \begin{pmatrix} i & 1 + 2i & 3i \\ -1 + 2i & -3i & -2 \\ 3i & 2 & 4i \end{pmatrix}, \tag{8.170}$$

is skew-Hermitian. It is easily shown the eigenvalues of iA are

$$\lambda_1 = 6.51907i, \qquad \lambda_2 = -0.104237i, \qquad \lambda_3 = -4.41484i. \tag{8.171}$$

Note the eigenvalues of this matrix are just those of the previous multiplied by i.
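Several of the Hermitian properties demonstrated in this example can be confirmed without an eigenvalue solver. The sketch below is an illustration that assumes the matrix A and vector x as they appear in the products of Eqs. (8.161) and (8.162); the helper names are hypothetical. It checks that $A = A^H$, that $x^H \cdot A \cdot x$ is real, and that iA equals the negative of its own Hermitian transpose.

```python
A = [[1, 2 - 1j, 3],
     [2 + 1j, -3, 2j],
     [3, -2j, 4]]
x = [3 + 2j, -1, 2 - 1j]


def matvec(M, v):
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def hermitian_transpose(M):
    """Transpose M and conjugate every entry."""
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]


b = matvec(A, x)

# Quadratic form x^H . A . x, which should have zero imaginary part.
quadratic_form = sum(complex(xi).conjugate() * bi for xi, bi in zip(x, b))

iA = [[1j * entry for entry in row] for row in A]
is_hermitian = hermitian_transpose(A) == A
is_skew_hermitian = hermitian_transpose(iA) == [[-e for e in row] for row in iA]

print(b, quadratic_form, is_hermitian, is_skew_hermitian)
```

All entries here are small Gaussian integers, so the floating-point results are exact.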



8.6 Orthogonal and unitary matrices 

8.6.1 Orthogonal matrices 

Expanding on a topic introduced on p. [2H, discussed on p. 183, and briefly discussed on
p. 287, a set of N N-dimensional real orthonormal vectors $\{e_1, e_2, \ldots, e_N\}$ can be formed
into an orthogonal matrix

$$Q = \begin{pmatrix} \vdots & \vdots & & \vdots \\ e_1 & e_2 & \ldots & e_N \\ \vdots & \vdots & & \vdots \end{pmatrix}. \tag{8.172}$$

Properties of orthogonal matrices include

1. $Q^T = Q^{-1}$, and both are orthogonal.

2. $Q^T \cdot Q = Q \cdot Q^T = I$.

3. $||Q||_2 = 1$, when the domain and range of Q are in Hilbert spaces.

4. $||Q \cdot x||_2 = ||x||_2$, where x is a vector.

5. $(Q \cdot x)^T \cdot (Q \cdot y) = x^T \cdot y$, where x and y are real vectors.

6. Eigenvalues of Q have $|\lambda_i| = 1$, $\lambda_i \in \mathbb{C}^1$, thus, $\rho(Q) = 1$.

7. $|\det Q| = 1$.

Geometrically, an orthogonal matrix is an operator which transforms but does not stretch 
a vector. For an orthogonal matrix to be a rotation, which is orientation-preserving, we must 
have det Q = 1. Rotation matrices, reflection matrices, and permutation matrices P are all 
orthogonal matrices. Recall that permutation matrices can also be reflection or rotation 
matrices. 



I 

Example 8.19 

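Find the orthogonal matrix corresponding to

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \tag{8.173}$$

The normalized eigenvectors are $(1/\sqrt{2}, 1/\sqrt{2})^T$ and $(-1/\sqrt{2}, 1/\sqrt{2})^T$. The orthogonal matrix is
thus

$$Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{8.174}$$

In the sense of Eq. (6.54), we can say

$$Q = \begin{pmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{pmatrix}, \tag{8.175}$$

and the angle of rotation of the coordinate axes is $\alpha = \pi/4$. We calculate the eigenvalues of Q to be
$\lambda = (1 \pm i)/\sqrt{2}$, which in exponential form becomes $\lambda = \exp(\pm i\pi/4)$, and so we see the rotation angle
is embedded within the argument of the polar representation of the eigenvalues. We also see $|\lambda| = 1$.
Note that Q is not symmetric. Also note that det Q = 1, so this orthogonal matrix is also a rotation
matrix.

If $\xi$ is an unrotated Cartesian vector, and our transformation to a rotated frame is $\xi = Q \cdot x$, so that
$x = Q^T \cdot \xi$, we see that the Cartesian unit vector $\xi = (1, 0)^T$ is represented in the rotated coordinate
system by

$$x = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}. \tag{8.176}$$

Thus, the counterclockwise rotation of the axes through angle $\alpha = \pi/4$ gives the Cartesian unit vector
$(1, 0)^T$ a new representation of $(1/\sqrt{2}, -1/\sqrt{2})^T$. We see that the other Cartesian unit vector $\xi = (0, 1)^T$
is represented in the rotated coordinate system by

$$x = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{8.177}$$

Had det Q = -1, the transformation would have been non-orientation preserving.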
Example 8.20

Analyze the three-dimensional orthogonal matrix

$$Q = \begin{pmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \end{pmatrix}. \tag{8.178}$$

Direct calculation reveals $||Q||_2 = 1$, det Q = 1, and $Q^T = Q^{-1}$, so clearly the matrix is a volume-
and orientation-preserving rotation matrix. It can also be shown to have a set of eigenvalues and
eigenvectors of

$$\lambda_1 = 1, \qquad e_1 = \begin{pmatrix} 0.886452 \\ 0.36718 \\ 0.281747 \end{pmatrix}, \tag{8.179}$$

$$\lambda_2 = \exp(2.9092i), \qquad e_2 = \begin{pmatrix} -0.18406 + 0.27060i \\ -0.076240 - 0.65328i \\ 0.678461 \end{pmatrix}, \tag{8.180}$$

$$\lambda_3 = \exp(-2.9092i), \qquad e_3 = \begin{pmatrix} -0.18406 - 0.27060i \\ -0.076240 + 0.65328i \\ 0.678461 \end{pmatrix}. \tag{8.181}$$

As expected, each eigenvalue has $|\lambda| = 1$. It can be shown that the eigenvector $e_1$ which is associated
with real eigenvalue, $\lambda_1 = 1$, is aligned with the so-called Euler axis, i.e. the axis in three-space about
which the rotation occurs. The remaining two eigenvalues are of the form $\exp(\pm\alpha i)$, where $\alpha$ is the
angle of rotation about the Euler axis. For this example, we have $\alpha = 2.9092$.

I 



I 

Example 8.21 

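Consider the composite action of three rotations on a vector x:

$$Q_1 \cdot Q_2 \cdot Q_3 \cdot x = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_1 & -\sin\alpha_1 \\ 0 & \sin\alpha_1 & \cos\alpha_1 \end{pmatrix}}_{Q_1} \cdot \underbrace{\begin{pmatrix} \cos\alpha_2 & 0 & \sin\alpha_2 \\ 0 & 1 & 0 \\ -\sin\alpha_2 & 0 & \cos\alpha_2 \end{pmatrix}}_{Q_2} \cdot \underbrace{\begin{pmatrix} \cos\alpha_3 & -\sin\alpha_3 & 0 \\ \sin\alpha_3 & \cos\alpha_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{Q_3} \cdot x. \tag{8.182}$$

It is easy to verify that $||Q_1||_2 = ||Q_2||_2 = ||Q_3||_2 = 1$, $\det Q_1 = \det Q_2 = \det Q_3 = 1$, so each is a
rotation. For $Q_3$, we find eigenvalues of $\lambda = 1, \cos\alpha_3 \pm i \sin\alpha_3$. These can be rewritten as $\lambda = 1, e^{\pm\alpha_3 i}$.
The eigenvector associated with the eigenvalue of 1 is (0, 0, 1). Thus, we can consider $Q_3$ to effect a
rotation of $\alpha_3$ about the 3-axis. Similarly, $Q_2$ effects a rotation of $\alpha_2$ about the 2-axis, and $Q_1$ effects
a rotation of $\alpha_1$ about the 1-axis.

So the action of the combination of rotations on a vector x is an initial rotation of $\alpha_3$ about the
3-axis: $Q_3 \cdot x$. This vector is then rotated through $\alpha_2$ about the 2-axis: $Q_2 \cdot (Q_3 \cdot x)$. Finally, there
is a rotation through $\alpha_1$ about the 1-axis: $Q_1 \cdot (Q_2 \cdot (Q_3 \cdot x))$. This is called a 3-2-1 rotation through
the so-called Euler angles of $\alpha_3$, $\alpha_2$, and $\alpha_1$. Note because in general matrix multiplication does not
commute, that the result will depend on the order of application of the rotations, e.g.

$$Q_1 \cdot Q_2 \cdot Q_3 \cdot x \ne Q_2 \cdot Q_1 \cdot Q_3 \cdot x, \qquad \forall x \in \mathbb{R}^3, \quad Q_1, Q_2, Q_3 \in \mathbb{R}^3 \times \mathbb{R}^3. \tag{8.183}$$

In contrast, it is not difficult to show that rotations in two dimensions do commute

$$Q_1 \cdot Q_2 \cdot Q_3 \cdot x = Q_2 \cdot Q_1 \cdot Q_3 \cdot x, \qquad \forall x \in \mathbb{R}^2, \quad Q_1, Q_2, Q_3 \in \mathbb{R}^2 \times \mathbb{R}^2. \tag{8.184}$$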
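The contrast between three- and two-dimensional rotations can be illustrated numerically. The sketch below is not from the original notes: the angle values are arbitrary, the sign convention chosen for the rotation about the 2-axis is an assumption (the standard right-handed one), and the helper names are hypothetical. It compares the two orderings of Eqs. (8.183) and (8.184) by the largest entry-wise difference of the products.

```python
import math


def rot1(a):
    """Rotation through angle a about the 1-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot2(a):
    """Rotation through angle a about the 2-axis (assumed sign convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot3(a):
    """Rotation through angle a about the 3-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rot2d(a):
    """Planar rotation through angle a."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s], [s, c]]


def matmul(X, Y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Y)]
            for row in X]


def max_abs_diff(X, Y):
    return max(abs(p - q) for rx, ry in zip(X, Y) for p, q in zip(rx, ry))


a1, a2, a3 = 0.3, 0.5, 0.7  # arbitrary illustrative angles

Q1, Q2, Q3 = rot1(a1), rot2(a2), rot3(a3)
diff_3d = max_abs_diff(matmul(matmul(Q1, Q2), Q3), matmul(matmul(Q2, Q1), Q3))

R1, R2, R3 = rot2d(a1), rot2d(a2), rot2d(a3)
diff_2d = max_abs_diff(matmul(matmul(R1, R2), R3), matmul(matmul(R2, R1), R3))

print(diff_3d, diff_2d)
```

The three-dimensional difference is of order 0.1 for these angles, while the two-dimensional difference is zero to round-off.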



8.6.2 Unitary matrices 

A unitary matrix U is a complex matrix with orthonormal columns. It is the complex analog 
of an orthogonal matrix. 

Properties of unitary matrices include 

• $U^H = U^{-1}$, and both are unitary.

• $U^H \cdot U = U \cdot U^H = I$.

• $||U||_2 = 1$, when the domain and range of U are in Hilbert spaces.

• $||U \cdot x||_2 = ||x||_2$, where x is a vector.

• $(U \cdot x)^H \cdot (U \cdot y) = x^H \cdot y$, where x and y are vectors.

• Eigenvalues of U have $|\lambda_i| = 1$, $\lambda_i \in \mathbb{C}^1$, thus, $\rho(U) = 1$.

• Eigenvectors of U corresponding to different eigenvalues are orthogonal.

• $|\det U| = 1$.

If det U = 1, one is tempted to say the unitary matrix operating on a vector induces a pure 
rotation in a complex space; however, the notion of an angle of rotation is elusive. 



I 

Example 8.22 

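Consider the unitary matrix

$$U = \begin{pmatrix} \frac{1 + i}{\sqrt{3}} & \frac{1 - 2i}{\sqrt{15}} \\ \frac{1}{\sqrt{3}} & \frac{1 + 3i}{\sqrt{15}} \end{pmatrix}. \tag{8.185}$$

The column vectors are easily seen to be normal. They are also orthogonal:

$$\left( \frac{1 - i}{\sqrt{3}}, \; \frac{1}{\sqrt{3}} \right) \begin{pmatrix} \frac{1 - 2i}{\sqrt{15}} \\ \frac{1 + 3i}{\sqrt{15}} \end{pmatrix} = 0 + 0i. \tag{8.186}$$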

\CC BY-NC-TW} 29 July 2012, Sen & Powers 






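The matrix itself is not Hermitian. Still, its Hermitian transpose exists:

$$U^H = \begin{pmatrix} \frac{1 - i}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1 + 2i}{\sqrt{15}} & \frac{1 - 3i}{\sqrt{15}} \end{pmatrix}. \tag{8.187}$$

It is then easily verified that

$$U^{-1} = U^H, \tag{8.188}$$

$$U \cdot U^H = U^H \cdot U = I. \tag{8.189}$$

The eigensystem is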



X 1 = -0.0986232 + 0.995125*, ei = ( °- 6881 ^ gg^ 5325 *) , (8.190) 

x nQ^i79j.n«RR99.- / -0.306358 + 0.5016332 \ 

A 2 = 0.934172 + 0.356822*, ^ 2 = (^ _ . 721676 _ . 36564 , J • (8-191) 

It is easily verified that the eigenvectors are orthogonal and the eigenvalues have magnitude of one. We
find $\det U = (-1 + 2i)/\sqrt{5}$, which yields $|\det U| = 1$. Also, $||U||_2 = 1$.
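The unitary properties claimed in this example can be checked in floating point. The sketch below is illustrative and assumes the entries of U to be $(1+i)/\sqrt{3}$, $(1-2i)/\sqrt{15}$ in the first row and $1/\sqrt{3}$, $(1+3i)/\sqrt{15}$ in the second, a reading that is consistent with the quoted eigenvalues (their sum matches the trace and their product matches the determinant). The helper names are hypothetical.

```python
import math

s3, s15 = math.sqrt(3), math.sqrt(15)

# Assumed entries of the unitary matrix of this example.
U = [[(1 + 1j) / s3, (1 - 2j) / s15],
     [1 / s3, (1 + 3j) / s15]]


def hermitian_transpose(M):
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Y)]
            for row in X]


UH = hermitian_transpose(U)
product = matmul(UH, U)  # should equal the identity to round-off

det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]
trace_U = U[0][0] + U[1][1]

print(product, det_U, abs(det_U))
```

The determinant has unit magnitude but is not equal to 1, so this U is not a pure rotation in the sense discussed before the example.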

I 



8.7 Discrete Fourier transforms 

It is a common practice in experimental and theoretical science and engineering to decompose 
a function or a signal into its Fourier modes. The amplitudes of these modes are often a
useful description of the function. A Fourier transform is a linear integral operator which
operates on continuous functions and yields results from which amplitudes of each frequency 
component can be determined. Its discrete analog is the Discrete Fourier transform (DFT). 
The DFT is a matrix which operates on a vector of data to yield a vector of transformed 
data. There exists a popular, albeit complicated, algorithm to compute the DFT, known 
as the Fast Fourier Transform (FFT). This will not be studied here; instead, a simpler and 
slower method is presented, which will be informally known as a Slow Fourier Transform 
(SFT). This discussion will simply present the algorithm for the SFT and demonstrate its 
use by example. 

The Fourier transform (FT) $Y(\kappa)$ of a function $y(x)$ is defined as

$$Y(\kappa) = \int_{-\infty}^{\infty} y(x)\, e^{-(2\pi i)\kappa x}\, dx, \tag{8.192}$$

and the inverse FT is defined as

$$y(x) = \int_{-\infty}^{\infty} Y(\kappa)\, e^{(2\pi i)\kappa x}\, d\kappa. \tag{8.193}$$

Here $\kappa$ is the wavenumber, and is the reciprocal of the wavelength. The FT has a discrete
analog. The connection between the two is often not transparent in the literature. With some 
effort a connection can be made at the expense of diverging from one school's notation to
the other's. Here, we will be satisfied with a form which demonstrates the analogs between 
the continuous and discrete transform, but will not be completely linked. To make the 
connection, one can construct a discrete approximation to the integral of the FT, and with 
some effort, arrive at an equivalent result. 

For the DFT, consider a function $y(x)$, $x \in [x_{min}, x_{max}]$, $x \in \mathbb{R}^1$, $y \in \mathbb{R}^1$. Now discretize
the domain into $N$ uniformly distributed points so that every $x_j$ is mapped to a $y_j$ for
$j = 0, \ldots, N-1$. Here we comply with the traditional, yet idiosyncratic, limits on $j$ which are
found in many texts on DFT. This offsets standard vector and matrix numbering schemes by
one, and so care must be exercised in implementing these algorithms with common software.
We seek a discrete analog of the continuous Fourier transformation of the form

$$y_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} c_k \exp\left( (2\pi i)\, k \left(\frac{N-1}{N}\right) \left( \frac{x_j - x_{min}}{x_{max} - x_{min}} \right) \right), \qquad j = 0, \ldots, N-1. \tag{8.194}$$

Here $k$ plays the role of $\kappa$, and $c_k$ plays the role of $Y(\kappa)$. For uniformly spaced $x_j$, one has

$$j = (N-1) \left( \frac{x_j - x_{min}}{x_{max} - x_{min}} \right), \tag{8.195}$$

so that we then seek

$$y_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} c_k \exp\left( (2\pi i) \frac{kj}{N} \right), \qquad j = 0, \ldots, N-1. \tag{8.196}$$

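Now consider the equation

$$z^N = 1, \qquad z \in \mathbb{C}^1. \tag{8.197}$$

This equation has $N$ distinct roots

$$z = e^{2\pi i j/N}, \qquad j = 0, \ldots, N-1. \tag{8.198}$$

Taking for convenience

$$w = e^{2\pi i/N}, \tag{8.199}$$

one sees that the $N$ roots are also described by $w^0, w^1, w^2, \ldots, w^{N-1}$. Now define the following
matrix

$$\mathbf{F} = \frac{1}{\sqrt{N}} \begin{pmatrix} 1 & 1 & 1 & \ldots & 1 \\ 1 & w & w^2 & \ldots & w^{N-1} \\ 1 & w^2 & w^4 & \ldots & w^{2(N-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & w^{N-1} & w^{2(N-1)} & \ldots & w^{(N-1)^2} \end{pmatrix}. \tag{8.200}$$

It is easy to demonstrate for arbitrary $N$ that $\mathbf{F}$ is unitary, that is

$$\mathbf{F}^H \cdot \mathbf{F} = \mathbf{I}. \tag{8.201}$$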
Since $\mathbf{F}$ is unitary, it is immediately known that $\mathbf{F}^H = \mathbf{F}^{-1}$, that $||\mathbf{F}||_2 = 1$, that the
eigenvalues of $\mathbf{F}$ have magnitude of unity, and that the column vectors of $\mathbf{F}$ are orthonormal.
Note that $\mathbf{F}$ is not Hermitian. Also note that many texts omit the factor $1/\sqrt{N}$ in the
definition of $\mathbf{F}$; this is not a major problem, but does render $\mathbf{F}$ non-unitary.

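Now given a vector $\mathbf{y} = y_j$, $j = 0, \ldots, N-1$, the DFT is defined as the following mapping

$$\mathbf{c} = \mathbf{F}^H \cdot \mathbf{y}. \tag{8.202}$$

The inverse transform is trivial due to the unitary nature of $\mathbf{F}$:

$$\mathbf{F}\cdot\mathbf{c} = \mathbf{F}\cdot\mathbf{F}^H\cdot\mathbf{y}, \tag{8.203}$$
$$\mathbf{F}\cdot\mathbf{c} = \mathbf{F}\cdot\mathbf{F}^{-1}\cdot\mathbf{y}, \tag{8.204}$$
$$\mathbf{F}\cdot\mathbf{c} = \mathbf{I}\cdot\mathbf{y}, \tag{8.205}$$
$$\mathbf{y} = \mathbf{F}\cdot\mathbf{c}. \tag{8.206}$$

Because our $\mathbf{F}$ is unitary, it preserves the length of vectors. Thus, it induces a Parseval's
equation

$$||\mathbf{y}||_2 = ||\mathbf{c}||_2. \tag{8.207}$$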
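
The mapping $\mathbf{c} = \mathbf{F}^H\cdot\mathbf{y}$ and its inverse can be sketched directly as matrix-vector products. The following is a minimal, unoptimized Python sketch of the slow Fourier transform described above; the function names are illustrative choices and not part of the original notes. It is applied to the data $y_j = j^2$, $j = 0, \ldots, 4$.

```python
import cmath
import math


def fourier_matrix(n):
    """Unitary n x n matrix with entries w**(j*k)/sqrt(n), w = exp(2*pi*i/n)."""
    scale = 1.0 / math.sqrt(n)
    # The exponent j*k is reduced modulo n since w**n = 1.
    return [[scale * cmath.exp(2j * cmath.pi * ((j * k) % n) / n)
             for k in range(n)] for j in range(n)]


def slow_fourier_transform(y):
    """Return c = F^H . y using O(N^2) operations."""
    n = len(y)
    F = fourier_matrix(n)
    return [sum(F[j][k].conjugate() * y[j] for j in range(n)) for k in range(n)]


def inverse_slow_fourier_transform(c):
    """Return y = F . c."""
    n = len(c)
    F = fourier_matrix(n)
    return [sum(F[j][k] * c[k] for k in range(n)) for j in range(n)]


def norm2(v):
    """Euclidean norm of a real or complex vector."""
    return math.sqrt(sum(abs(vi) ** 2 for vi in v))


y = [float(j * j) for j in range(5)]      # samples of y = x^2 at x = 0, ..., 4
c = slow_fourier_transform(y)
y_back = inverse_slow_fourier_transform(c)

print("c =", c)
print("||y||_2 =", norm2(y), " ||c||_2 =", norm2(c))
```

Because $\mathbf{F}$ is unitary, the two printed norms should agree, and `y_back` should reproduce `y` to within round-off.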



I 

Example 8.23 

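Consider a five term DFT of the function

$$y = x^2, \qquad x \in [0, 4]. \tag{8.208}$$

Take then for $N = 5$, a set of uniformly distributed points in the domain and their image in the
range:

$$x_0 = 0, \quad x_1 = 1, \quad x_2 = 2, \quad x_3 = 3, \quad x_4 = 4, \tag{8.209}$$

$$y_0 = 0, \quad y_1 = 1, \quad y_2 = 4, \quad y_3 = 9, \quad y_4 = 16. \tag{8.210}$$

Now for $N = 5$, one has

$$w = e^{2\pi i/5} = \underbrace{\frac{1}{4}\left(-1 + \sqrt{5}\right)}_{=\Re(w)} + \underbrace{\frac{1}{2}\sqrt{\frac{1}{2}\left(5 + \sqrt{5}\right)}}_{=\Im(w)}\, i = 0.309 + 0.951i. \tag{8.211}$$

The five distinct roots of $z^5 = 1$ are

$$z^{(0)} = w^0 = 1, \tag{8.212}$$
$$z^{(1)} = w^1 = 0.309 + 0.951i, \tag{8.213}$$
$$z^{(2)} = w^2 = -0.809 + 0.588i, \tag{8.214}$$
$$z^{(3)} = w^3 = -0.809 - 0.588i, \tag{8.215}$$
$$z^{(4)} = w^4 = 0.309 - 0.951i. \tag{8.216}$$

The matrix $\mathbf{F}$ is then

$$\mathbf{F} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & w & w^2 & w^3 & w^4 \\ 1 & w^2 & w^4 & w^6 & w^8 \\ 1 & w^3 & w^6 & w^9 & w^{12} \\ 1 & w^4 & w^8 & w^{12} & w^{16} \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & w & w^2 & w^3 & w^4 \\ 1 & w^2 & w^4 & w^1 & w^3 \\ 1 & w^3 & w^1 & w^4 & w^2 \\ 1 & w^4 & w^3 & w^2 & w^1 \end{pmatrix}, \tag{8.217}$$

$$\mathbf{F} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 0.309 + 0.951i & -0.809 + 0.588i & -0.809 - 0.588i & 0.309 - 0.951i \\ 1 & -0.809 + 0.588i & 0.309 - 0.951i & 0.309 + 0.951i & -0.809 - 0.588i \\ 1 & -0.809 - 0.588i & 0.309 + 0.951i & 0.309 - 0.951i & -0.809 + 0.588i \\ 1 & 0.309 - 0.951i & -0.809 - 0.588i & -0.809 + 0.588i & 0.309 + 0.951i \end{pmatrix}. \tag{8.218}$$

Now $\mathbf{c} = \mathbf{F}^H \cdot \mathbf{y}$, so

$$\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 0.309 - 0.951i & -0.809 - 0.588i & -0.809 + 0.588i & 0.309 + 0.951i \\ 1 & -0.809 - 0.588i & 0.309 + 0.951i & 0.309 - 0.951i & -0.809 + 0.588i \\ 1 & -0.809 + 0.588i & 0.309 - 0.951i & 0.309 + 0.951i & -0.809 - 0.588i \\ 1 & 0.309 + 0.951i & -0.809 + 0.588i & -0.809 - 0.588i & 0.309 - 0.951i \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 4 \\ 9 \\ 16 \end{pmatrix} = \begin{pmatrix} 13.416 \\ -2.354 + 7.694i \\ -4.354 + 1.816i \\ -4.354 - 1.816i \\ -2.354 - 7.694i \end{pmatrix}. \tag{8.219}$$

Now one is often interested in the magnitude of the components of $\mathbf{c}$, which gives a measure of the
so-called energy associated with each Fourier mode. So one calculates a vector of the magnitude of
each component as

$$\begin{pmatrix} \sqrt{c_0 \overline{c_0}} \\ \sqrt{c_1 \overline{c_1}} \\ \sqrt{c_2 \overline{c_2}} \\ \sqrt{c_3 \overline{c_3}} \\ \sqrt{c_4 \overline{c_4}} \end{pmatrix} = \begin{pmatrix} |c_0| \\ |c_1| \\ |c_2| \\ |c_3| \\ |c_4| \end{pmatrix} = \begin{pmatrix} 13.4164 \\ 8.0463 \\ 4.7178 \\ 4.7178 \\ 8.0463 \end{pmatrix}. \tag{8.220}$$

Now due to a phenomenon known as aliasing, explained in detail in standard texts, the values of $c_k$
which have the most significance are the first half $c_k$, $k = 0, \ldots, N/2$.
Here

$$||\mathbf{y}||_2 = ||\mathbf{c}||_2 = \sqrt{354} = 18.8149. \tag{8.221}$$

Note that by construction

$$y_0 = \frac{1}{\sqrt{5}} \left( c_0 + c_1 + c_2 + c_3 + c_4 \right), \tag{8.222}$$
$$y_1 = \frac{1}{\sqrt{5}} \left( c_0 + c_1 e^{2\pi i/5} + c_2 e^{4\pi i/5} + c_3 e^{6\pi i/5} + c_4 e^{8\pi i/5} \right), \tag{8.223}$$
$$y_2 = \frac{1}{\sqrt{5}} \left( c_0 + c_1 e^{4\pi i/5} + c_2 e^{8\pi i/5} + c_3 e^{12\pi i/5} + c_4 e^{16\pi i/5} \right), \tag{8.224}$$
$$y_3 = \frac{1}{\sqrt{5}} \left( c_0 + c_1 e^{6\pi i/5} + c_2 e^{12\pi i/5} + c_3 e^{18\pi i/5} + c_4 e^{24\pi i/5} \right), \tag{8.225}$$
$$y_4 = \frac{1}{\sqrt{5}} \left( c_0 + c_1 e^{8\pi i/5} + c_2 e^{16\pi i/5} + c_3 e^{24\pi i/5} + c_4 e^{32\pi i/5} \right). \tag{8.226}$$

In general, it is seen that $y_j$ can be described by

$$y_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} c_k \exp\left( (2\pi i) \frac{kj}{N} \right), \qquad j = 0, \ldots, N-1. \tag{8.227}$$

Realizing now that for a uniform discretization, such as done here, that

$$\Delta x = \frac{x_{max} - x_{min}}{N-1}, \tag{8.228}$$

and that

$$x_j = j \Delta x + x_{min}, \qquad j = 0, \ldots, N-1, \tag{8.229}$$

one has

$$x_j = j \left( \frac{x_{max} - x_{min}}{N-1} \right) + x_{min}, \qquad j = 0, \ldots, N-1. \tag{8.230}$$

Solving for $j$, one gets

$$j = (N-1) \left( \frac{x_j - x_{min}}{x_{max} - x_{min}} \right), \tag{8.231}$$

so that $y_j$ can be expressed as a Fourier-type expansion in terms of $x_j$ as

$$y_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} c_k \exp\left( (2\pi i)\, k \left( \frac{N-1}{N} \right) \left( \frac{x_j - x_{min}}{x_{max} - x_{min}} \right) \right), \qquad j = 0, \ldots, N-1. \tag{8.232}$$

Here, the wavenumber of mode $k$, $\kappa_k$, is seen to be

$$\kappa_k = k\, \frac{N-1}{N}. \tag{8.233}$$

And as $N \to \infty$, one has

$$\kappa_k \sim k. \tag{8.234}$$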



I 

Example 8.24 

The real power of the DFT is seen in its ability to select amplitudes of modes of signals at certain
frequencies. Consider the signal for $x \in [0, 3]$

$$y(x) = 10 \sin\left( (2\pi) \frac{2x}{3} \right) + 2 \sin\left( (2\pi) \frac{10x}{3} \right) + \sin\left( (2\pi) \frac{100x}{3} \right). \tag{8.235}$$

Rescaling the domain so as to take $x \in [0, 3]$ into $\tilde{x} \in [0, 1]$ via the transformation $\tilde{x} = x/3$, one has

$$y(\tilde{x}) = 10 \sin\left( (2\pi) 2\tilde{x} \right) + 2 \sin\left( (2\pi) 10\tilde{x} \right) + \sin\left( (2\pi) 100\tilde{x} \right). \tag{8.236}$$

To capture the high wavenumber components of the signal, one must have a sufficiently large value of
$N$. Note in the transformed domain that the smallest wavelength is $\lambda = 1/100 = 0.01$. So for a domain
length of unity, one needs at least $N = 100$ sampling points merely to place one point per wavelength, and
more than two points per wavelength, $N > 200$, to resolve this mode without aliasing. Let us choose to take
more points, $N = 523$. There is no problem in choosing an unusual number of points for this so-called slow Fourier
transform. If a classical radix-2 FFT were attempted, one would have to choose an integral power of 2 as the number of
points.

A plot of the function $y(x)$ and two versions of its DFT, $|c_k|$ vs. $k$, is given in Fig. 8.4. Note that
$|c_k|$ has its peaks at $k = 2$, $k = 10$, and $k = 100$, equal to the wavenumbers of the generating sine
functions, $\kappa_1 = 2$, $\kappa_2 = 10$, and $\kappa_3 = 100$. To avoid the confusing, and non-physical, aliasing effect,
only half the $|c_k|$ values have been plotted in the first DFT of Fig. 8.4. The second DFT here plots all
values of $|c_k|$ and thus exhibits aliasing for large $k$.

Figure 8.4: Plot of a three term sinusoid $y(x)$ and its discrete Fourier transform for $N = 523$.
The first DFT is plotted from $k = 0, \ldots, N/2$ and thus represents the original signal well.
The second DFT is plotted from $k = 0, \ldots, N-1$ and exhibits aliasing effects at high $k$.
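
The peak selection described in this example can be reproduced with a short script. The following is a minimal sketch, not the authors' code: it assumes the periodic sampling $\tilde{x}_j = j/N$, which differs slightly from the endpoint-inclusive grid of Eq. (8.230) but places each sine exactly on one mode so that the peaks are sharp. The function name `dft_magnitudes` is illustrative.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |c_k|, k = 0, ..., N-1, for c = F^H . y with the unitary scaling."""
    n = len(samples)
    magnitudes = []
    for k in range(n):
        # c_k = (1/sqrt(N)) * sum_j y_j * exp(-(2 pi i) j k / N)
        ck = sum(samples[j] * cmath.exp(-2j * cmath.pi * ((j * k) % n) / n)
                 for j in range(n)) / math.sqrt(n)
        magnitudes.append(abs(ck))
    return magnitudes


N = 523
# Assumption: periodic sampling of the rescaled domain, x_j = j/N.
signal = [10 * math.sin(2 * math.pi * 2 * x)
          + 2 * math.sin(2 * math.pi * 10 * x)
          + math.sin(2 * math.pi * 100 * x)
          for x in (j / N for j in range(N))]

magnitudes = dft_magnitudes(signal)

# Keep only the first half of the modes to avoid the aliased mirror images.
half = magnitudes[: N // 2 + 1]
peaks = sorted(sorted(range(len(half)), key=lambda k: half[k], reverse=True)[:3])

print("three largest modes in the first half:", peaks)
print("mirror of mode 2 at k = N - 2:", magnitudes[N - 2])
```

The three largest entries of the first half should sit at the wavenumbers of the generating sines, while the full vector carries mirror images of the same peaks at $k = N - 2$, $N - 10$, and $N - 100$.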

I 



I 

Example 8.25 

Now take the DFT of a signal which is corrupted by so-called white, or random, noise. The signal
here is given in $x \in [0, 1]$ by

$$y(x) = \sin\left( (2\pi) 10x \right) + \sin\left( (2\pi) 100x \right) + f_{rand}[-1, 1](x). \tag{8.237}$$

Here $f_{rand}[-1, 1](x)$ returns a random number between $-1$ and $1$ for any value of $x$. A plot of the
function $y(x)$ and two versions of its 607 point DFT, $|c_k|$ vs. $k$, is given in Fig. 8.5. In the raw data
plotted in Fig. 8.5, it is difficult to distinguish the signal from the random noise. But on examination
of the accompanying DFT plot, it is clear that there are unambiguous components of the signal which
peak at $k = 10$ and $k = 100$, which indicates there is a strong component of the signal with $k = 10$ and
$k = 100$. Once again, to avoid the confusing, and non-physical, aliasing effect, only half the $|c_k|$ values
have been plotted in the first DFT of Fig. 8.5. The second DFT gives all values of $|c_k|$ and exhibits
aliasing.

Figure 8.5: Plot of a two-term sinusoid accompanied by random noise $y(x)$ and its discrete
Fourier transform for $N = 607$ points. The first DFT is plotted from $k = 0, \ldots, N/2$ and
thus represents the original signal well. The second DFT is plotted from $k = 0, \ldots, N-1$
and exhibits aliasing effects at high $k$.



8.8 Matrix decompositions 

One of the most important tasks, especially in the numerical solution of algebraic and
differential equations, is decomposing general matrices into simpler components. A brief
discussion will be given here of some of the more important decompositions. Full discussions
can be found in Strang's text. It is noted that many popular software programs, such as
MATLAB, Mathematica, LAPACK libraries, etc. have routines which calculate these
decompositions.



8.8.1 L • D • U decomposition 

Probably the most important technique in solving linear systems of algebraic equations of
the form $\mathbf{A}\cdot\mathbf{x} = \mathbf{b}$ uses the decomposition

$$\mathbf{A} = \mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}, \tag{8.238}$$

where $\mathbf{A}$ is a square matrix, $\mathbf{P}$ is a never-singular permutation matrix, $\mathbf{L}$ is a lower
triangular matrix, $\mathbf{D}$ is a diagonal matrix, and $\mathbf{U}$ is an upper triangular matrix. The notation of
$\mathbf{U}$ for the upper triangular matrix is common, and should not be confused with the identical
notation for a unitary matrix. In other contexts $\mathbf{R}$ is sometimes used for an upper
triangular matrix, and $\mathbf{P}$ is sometimes used for a projection matrix. All terms can be found by
ordinary Gaussian elimination. The permutation matrix is necessary in case row exchanges
are necessary in the Gaussian elimination.

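A common numerical algorithm to solve for $\mathbf{x}$ in $\mathbf{A}\cdot\mathbf{x} = \mathbf{b}$ is as follows:

• Factor $\mathbf{A}$ into $\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$ so that $\mathbf{A}\cdot\mathbf{x} = \mathbf{b}$ becomes

$$\underbrace{\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}}_{\mathbf{A}}\cdot\mathbf{x} = \mathbf{b}. \tag{8.239}$$

• Operate on both sides of Eq. (8.239) with $(\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D})^{-1}$ to get

$$\mathbf{U}\cdot\mathbf{x} = \left(\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\right)^{-1}\cdot\mathbf{b}. \tag{8.240}$$

• Solve next for the new variable $\mathbf{c}$ in the new equation

$$\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{c} = \mathbf{b}, \tag{8.241}$$

so

$$\mathbf{c} = \left(\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\right)^{-1}\cdot\mathbf{b}. \tag{8.242}$$

The triangular form of $\mathbf{L}\cdot\mathbf{D}$ renders the inversion of $(\mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D})$ to be much more
computationally efficient than inversion of an arbitrary square matrix.

• Substitute $\mathbf{c}$ from Eq. (8.242) into Eq. (8.240), the modified version of the original
equation, to get

$$\mathbf{U}\cdot\mathbf{x} = \mathbf{c}, \tag{8.243}$$

so

$$\mathbf{x} = \mathbf{U}^{-1}\cdot\mathbf{c}. \tag{8.244}$$

Again since $\mathbf{U}$ is triangular, the inversion is computationally efficient.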



I 

Example 8.26 

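Find the $\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$ decomposition of the matrix:

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix}. \tag{8.245}$$

(Footnote: If $\mathbf{A}$ is not square, there is an equivalent decomposition, known as row echelon form, to be discussed in
Sec. 8.8.3.)

The process is essentially a series of row operations, which is the essence of Gaussian elimination.
First we operate to transform the $-22$ and $16$ in the first column into zeroes. Crucial in this step is the
necessity of the term in the 1,1 slot, known as the pivot, to be non-zero. If it is zero, a row exchange will
be necessary, mandating a permutation matrix which is not the identity matrix. In this case there are
no such problems. We multiply the first row by $22/5$ and subtract from the second row, then multiply
the first row by $-16/5$ and subtract from the third row. The factors $22/5$ and $-16/5$ will go in the 2,1
and 3,1 slots of the matrix $\mathbf{L}$. The diagonal of $\mathbf{L}$ always is filled with ones. This row operation yields

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 22/5 & 1 & 0 \\ -16/5 & 0 & 1 \end{pmatrix} \begin{pmatrix} -5 & 4 & 9 \\ 0 & -18/5 & -108/5 \\ 0 & 24/5 & 114/5 \end{pmatrix}. \tag{8.246}$$

Now multiplying the new second row by $-4/3$, subtracting this from the third row, and depositing the
factor $-4/3$ into the 3,2 slot of the matrix $\mathbf{L}$, we get

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 22/5 & 1 & 0 \\ -16/5 & -4/3 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} -5 & 4 & 9 \\ 0 & -18/5 & -108/5 \\ 0 & 0 & -6 \end{pmatrix}}_{\mathbf{U}}. \tag{8.247}$$

The form given in Eq. (8.247) is often described as the $\mathbf{L}\cdot\mathbf{U}$ decomposition of $\mathbf{A}$. We can force the
diagonal terms of the upper triangular matrix to unity by extracting a diagonal matrix $\mathbf{D}$ to form the
$\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$ decomposition:

$$\mathbf{A} = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 22/5 & 1 & 0 \\ -16/5 & -4/3 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} -5 & 0 & 0 \\ 0 & -18/5 & 0 \\ 0 & 0 & -6 \end{pmatrix}}_{\mathbf{D}} \underbrace{\begin{pmatrix} 1 & -4/5 & -9/5 \\ 0 & 1 & 6 \\ 0 & 0 & 1 \end{pmatrix}}_{\mathbf{U}}. \tag{8.248}$$

Note that $\mathbf{D}$ does not contain the eigenvalues of $\mathbf{A}$. Also since there were no row exchanges necessary,
$\mathbf{P} = \mathbf{P}^{-1} = \mathbf{I}$, and it has not been included.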
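
The factor-then-substitute strategy can be sketched in a few lines of exact rational arithmetic. The following is a minimal Python sketch under the assumption that no row exchanges are needed, so that $\mathbf{P} = \mathbf{I}$; the function names `ldu_factor` and `ldu_solve` and the right side `b` are illustrative and not part of the original notes. The matrix is that of Eq. (8.245).

```python
from fractions import Fraction


def ldu_factor(A):
    """Return (L, D, U) with A = L . D . U, assuming no row exchanges (P = I)."""
    n = len(A)
    W = [[Fraction(v) for v in row] for row in A]       # working copy
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = W[col][col]
        if pivot == 0:
            raise ValueError("zero pivot: a row exchange would be needed")
        for row in range(col + 1, n):
            factor = W[row][col] / pivot                 # deposited in L
            L[row][col] = factor
            W[row] = [W[row][k] - factor * W[col][k] for k in range(n)]
    D = [W[i][i] for i in range(n)]                      # diagonal entries
    U = [[W[i][j] / D[i] for j in range(n)] for i in range(n)]
    return L, D, U


def ldu_solve(L, D, U, b):
    """Solve L . D . U . x = b by forward substitution, scaling, back substitution."""
    n = len(b)
    z = [Fraction(0)] * n
    for i in range(n):                                   # L . z = b
        z[i] = Fraction(b[i]) - sum(L[i][k] * z[k] for k in range(i))
    c = [z[i] / D[i] for i in range(n)]                  # D . c = z
    x = [Fraction(0)] * n
    for i in reversed(range(n)):                         # U . x = c
        x[i] = c[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))
    return x


A = [[-5, 4, 9], [-22, 14, 18], [16, -8, -6]]
L, D, U = ldu_factor(A)

b = [1, 2, 3]                                            # hypothetical right side
x = ldu_solve(L, D, U, b)
residual = [sum(A[i][j] * x[j] for j in range(3)) - b[i] for i in range(3)]

print("L =", L)
print("D =", D)
print("U =", U)
print("x =", x, " residual =", residual)
```

Neither triangular solve forms an inverse explicitly; each costs on the order of $N^2$ operations, which is the efficiency gain noted in the algorithm above.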

I 



I 

Example 8.27 

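Find the $\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$ decomposition of the matrix $\mathbf{A}$:

$$\mathbf{A} = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}. \tag{8.249}$$

There is a zero in the pivot, so a row exchange is necessary:

$$\mathbf{P}\cdot\mathbf{A} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 & 2 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix}. \tag{8.250}$$

Performing Gaussian elimination by subtracting 1 times the first row from the second and depositing
the 1 in the 2,1 slot of $\mathbf{L}$, we get

$$\mathbf{P}\cdot\mathbf{A} = \mathbf{L}\cdot\mathbf{U} \to \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix}. \tag{8.251}$$

Now subtracting 1 times the second row from the third, and depositing the 1 in the 3,2 slot of $\mathbf{L}$, we get

$$\mathbf{P}\cdot\mathbf{A} = \mathbf{L}\cdot\mathbf{U} \to \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.252}$$

Now $\mathbf{U}$ already has ones on the diagonal, so the diagonal matrix $\mathbf{D}$ is simply the identity matrix. Using
this and inverting $\mathbf{P}$, which is $\mathbf{P}$ itself (!), we get the final decomposition

$$\mathbf{A} = \mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U} \to \begin{pmatrix} 0 & 1 & 2 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix} = \underbrace{\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}}_{\mathbf{P}^{-1}} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{\mathbf{D}} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}}_{\mathbf{U}}. \tag{8.253}$$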




8.8.2 Cholesky decomposition 

If $\mathbf{A}$ is a Hermitian positive definite matrix, we can define a Cholesky decomposition. Because
$\mathbf{A}$ must be positive definite, it must be square. The Cholesky decomposition is as follows:

$$\mathbf{A} = \mathbf{U}^H\cdot\mathbf{U}. \tag{8.254}$$

Here $\mathbf{U}$ is an upper triangular matrix. One might think of $\mathbf{U}$ as the rough equivalent of the
square root of the positive definite $\mathbf{A}$. We also have the related decomposition

$$\mathbf{A} = \hat{\mathbf{U}}^H\cdot\mathbf{D}\cdot\hat{\mathbf{U}}, \tag{8.255}$$

where $\hat{\mathbf{U}}$ is upper triangular with a value of unity on its diagonal, and $\mathbf{D}$ is diagonal.

If we define a lower triangular $\mathbf{L}$ as $\mathbf{L} = \mathbf{U}^H$, the Cholesky decomposition can be rewritten
as

$$\mathbf{A} = \mathbf{L}\cdot\mathbf{L}^H. \tag{8.256}$$

There also exists an analogous decomposition

$$\mathbf{A} = \hat{\mathbf{L}}\cdot\mathbf{D}\cdot\hat{\mathbf{L}}^H. \tag{8.257}$$

Note also that these definitions hold as well for real $\mathbf{A}$; in such cases, we can simply replace
the Hermitian transpose by the ordinary transpose.
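
The factor $\mathbf{U}$ can be built one row at a time by matching entries of $\mathbf{U}^H\cdot\mathbf{U}$ to those of $\mathbf{A}$. The following is a minimal Python sketch of that row-by-row construction; the function name `cholesky_upper` and the $2 \times 2$ Hermitian test matrix with diagonal entries 5 and off-diagonal entries $\pm 4i$ are illustrative assumptions, not taken verbatim from the notes.

```python
import math


def cholesky_upper(A):
    """Return upper triangular U with A = U^H . U for Hermitian positive definite A."""
    n = len(A)
    U = [[0j] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: A_ii = sum_{k<i} |U_ki|^2 + U_ii^2, with U_ii real.
        diag = (A[i][i] - sum(abs(U[k][i]) ** 2 for k in range(i))).real
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        U[i][i] = complex(math.sqrt(diag))
        for j in range(i + 1, n):
            # Off-diagonal: A_ij = sum_{k<i} conj(U_ki) U_kj + U_ii U_ij.
            s = A[i][j] - sum(U[k][i].conjugate() * U[k][j] for k in range(i))
            U[i][j] = s / U[i][i]
    return U


# Assumed Hermitian positive definite matrix with eigenvalues 1 and 9.
A = [[5 + 0j, 4j],
     [-4j, 5 + 0j]]
U = cholesky_upper(A)

# Rebuild A from U^H . U as a check.
rebuilt = [[sum(U[k][i].conjugate() * U[k][j] for k in range(2))
            for j in range(2)] for i in range(2)]

print("U =", U)
print("U^H . U =", rebuilt)
```

For this matrix the factor should have $\sqrt{5}$ and $3/\sqrt{5}$ on its diagonal and $4i/\sqrt{5}$ above it.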

Example 8.28 

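The Cholesky decomposition of a Hermitian matrix $\mathbf{A}$ is as follows

$$\mathbf{A} = \begin{pmatrix} 5 & 4i \\ -4i & 5 \end{pmatrix} = \mathbf{U}^H\cdot\mathbf{U} = \underbrace{\begin{pmatrix} \sqrt{5} & 0 \\ -\frac{4i}{\sqrt{5}} & \frac{3}{\sqrt{5}} \end{pmatrix}}_{\mathbf{U}^H} \cdot \underbrace{\begin{pmatrix} \sqrt{5} & \frac{4i}{\sqrt{5}} \\ 0 & \frac{3}{\sqrt{5}} \end{pmatrix}}_{\mathbf{U}}. \tag{8.258}$$

Note the eigenvalues of $\mathbf{A}$ are $\lambda = 1$, $\lambda = 9$, so the matrix is indeed positive definite.
We can also write in alternative form

$$\mathbf{A} = \begin{pmatrix} 5 & 4i \\ -4i & 5 \end{pmatrix} = \hat{\mathbf{U}}^H\cdot\mathbf{D}\cdot\hat{\mathbf{U}} = \underbrace{\begin{pmatrix} 1 & 0 \\ -\frac{4i}{5} & 1 \end{pmatrix}}_{\hat{\mathbf{U}}^H} \cdot \underbrace{\begin{pmatrix} 5 & 0 \\ 0 & \frac{9}{5} \end{pmatrix}}_{\mathbf{D}} \cdot \underbrace{\begin{pmatrix} 1 & \frac{4i}{5} \\ 0 & 1 \end{pmatrix}}_{\hat{\mathbf{U}}}. \tag{8.259}$$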



8.8.3 Row echelon form 

When $\mathbf{A}$ is not square, we can still use Gaussian elimination to cast the matrix in row echelon
form:

$$\mathbf{A} = \mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}. \tag{8.260}$$

Again $\mathbf{P}$ is a never-singular permutation matrix, $\mathbf{L}$ is lower triangular and square, $\mathbf{D}$ is
diagonal and square, $\mathbf{U}$ is upper triangular and rectangular and of the same dimension as
$\mathbf{A}$. The strategy is to use row operations in such a fashion that ones or zeroes appear on the
diagonal.



I 

Example 8.29 

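Determine the row-echelon form of the non-square matrix,

$$\mathbf{A} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}. \tag{8.261}$$

We take 2 times the first row and subtract the result from the second row. The scalar 2 is deposited
in the 2,1 slot in the $\mathbf{L}$ matrix. So

$$\mathbf{A} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} 1 & -3 & 2 \\ 0 & 6 & -1 \end{pmatrix}}_{\mathbf{U}}. \tag{8.262}$$

Again, Eq. (8.262) is also known as an $\mathbf{L}\cdot\mathbf{U}$ decomposition, and is often as useful as the $\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$
decomposition. There is no row exchange so the permutation matrix and its inverse are the identity
matrix. We extract a 1 and 6 to form the diagonal matrix $\mathbf{D}$, so the final form is

$$\mathbf{A} = \mathbf{P}^{-1}\cdot\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U} = \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{\mathbf{P}^{-1}} \underbrace{\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}}_{\mathbf{D}} \underbrace{\begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & -\frac{1}{6} \end{pmatrix}}_{\mathbf{U}}. \tag{8.263}$$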



Row echelon form is an especially useful form for under-constrained systems as illustrated 
in the following example. 



I 

Example 8.30 

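Consider solutions for the unknown $\mathbf{x}$ in the equation $\mathbf{A}\cdot\mathbf{x} = \mathbf{b}$ where $\mathbf{A}$ is known, $\mathbf{A} : \mathbb{R}^5 \to \mathbb{R}^3$,
and $\mathbf{b}$ is left general, but considered to be known:

$$\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 4 & 2 & -2 & 1 & 0 \\ -2 & -1 & 1 & -2 & -6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}. \tag{8.264}$$

We perform Gaussian elimination row operations on the second and third rows to get zeros in the
first column:

$$\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 0 & 0 & 0 & -1 & -4 \\ 0 & 0 & 0 & -1 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} b_1 \\ -2b_1 + b_2 \\ b_1 + b_3 \end{pmatrix}. \tag{8.265}$$

The next round of Gaussian elimination works on the third row and yields

$$\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 0 & 0 & 0 & -1 & -4 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} b_1 \\ -2b_1 + b_2 \\ 3b_1 - b_2 + b_3 \end{pmatrix}. \tag{8.266}$$

Note that the reduced third equation gives

$$0 = 3b_1 - b_2 + b_3. \tag{8.267}$$

This is the equation of a plane in $\mathbb{R}^3$. Thus, arbitrary $\mathbf{b} \in \mathbb{R}^3$ will not satisfy the original equation.
Said another way, the operator $\mathbf{A}$ maps arbitrary five-dimensional vectors $\mathbf{x}$ into a two-dimensional
subspace of a three-dimensional vector space. The rank of $\mathbf{A}$ is 2. Thus, the dimension of both the row
space and the column space is 2; the dimension of the right null space is 3, and the dimension of the
left null space is 1.

We also note there are two non-trivial equations remaining. The first non-zero elements from the
left of each row are known as the pivots. The number of pivots is equal to the rank of the matrix.
Variables which correspond to each pivot are known as basic variables. Variables with no pivot are
known as free variables. Here the basic variables are $x_1$ and $x_4$, while the free variables are $x_2$, $x_3$, and
$x_5$.

Now enforcing the constraint $3b_1 - b_2 + b_3 = 0$, without which there will be no solution, we can
set each free variable to an arbitrary value, and then solve the resulting square system. Take $x_2 = r$,
$x_3 = s$, $x_5 = t$, where here $r$, $s$, and $t$ are arbitrary real scalar constants. So

$$\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 0 & 0 & 0 & -1 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ r \\ s \\ x_4 \\ t \end{pmatrix} = \begin{pmatrix} b_1 \\ -2b_1 + b_2 \end{pmatrix}, \tag{8.268}$$

which gives

$$\begin{pmatrix} 2 & 1 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_4 \end{pmatrix} = \begin{pmatrix} b_1 - r + s - 2t \\ -2b_1 + b_2 + 4t \end{pmatrix}, \tag{8.269}$$

which yields

$$x_4 = 2b_1 - b_2 - 4t, \tag{8.270}$$
$$x_1 = \frac{1}{2}\left( -b_1 + b_2 - r + s + 2t \right). \tag{8.271}$$

Thus

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}(-b_1 + b_2 - r + s + 2t) \\ r \\ s \\ 2b_1 - b_2 - 4t \\ t \end{pmatrix} \tag{8.272}$$

$$= \begin{pmatrix} \frac{1}{2}(-b_1 + b_2) \\ 0 \\ 0 \\ 2b_1 - b_2 \\ 0 \end{pmatrix} + r \begin{pmatrix} -\frac{1}{2} \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + s \begin{pmatrix} \frac{1}{2} \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} + t \begin{pmatrix} 1 \\ 0 \\ 0 \\ -4 \\ 1 \end{pmatrix}, \qquad r, s, t \in \mathbb{R}^1. \tag{8.273}$$

The coefficients $r$, $s$, and $t$ multiply the three right null space vectors. These, in combination with two
independent row space vectors, form a basis for any vector $\mathbf{x}$. Thus, we can again cast the solution as a
particular solution which is a unique combination of independent row space vectors and a non-unique
combination of the right null space vectors (the homogeneous solution):

$$\mathbf{x} = \underbrace{\frac{25b_1 - 13b_2}{106} \begin{pmatrix} 2 \\ 1 \\ -1 \\ 1 \\ 2 \end{pmatrix} + \frac{-13b_1 + 11b_2}{106} \begin{pmatrix} 4 \\ 2 \\ -2 \\ 1 \\ 0 \end{pmatrix}}_{\text{row space}} + \underbrace{\hat{r} \begin{pmatrix} -\frac{1}{2} \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \hat{s} \begin{pmatrix} \frac{1}{2} \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} + \hat{t} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -4 \\ 1 \end{pmatrix}}_{\text{right null space}}. \tag{8.274}$$

In matrix form, we can say that

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} 2 & 4 & -\frac{1}{2} & \frac{1}{2} & 1 \\ 1 & 2 & 1 & 0 & 0 \\ -1 & -2 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & -4 \\ 2 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{25b_1 - 13b_2}{106} \\ \frac{-13b_1 + 11b_2}{106} \\ \hat{r} \\ \hat{s} \\ \hat{t} \end{pmatrix}. \tag{8.275}$$

Here we have taken $\hat{r} = r + (b_1 - 9b_2)/106$, $\hat{s} = s + (-b_1 + 9b_2)/106$, and $\hat{t} = t + (-50b_1 + 26b_2)/106$; as
they are arbitrary constants multiplying vectors in the right null space, the relationship to $b_1$ and $b_2$
is actually unimportant. As before, while the null space basis vectors are orthogonal to the row space
basis vectors, the entire system is not orthogonal. The Gram-Schmidt procedure could be used to cast
the solution on either an orthogonal or orthonormal basis.

It is also noted that we have effectively found the $\mathbf{L}\cdot\mathbf{U}$ decomposition of $\mathbf{A}$. The terms in $\mathbf{L}$ are
from the Gaussian elimination, and we have already $\mathbf{U}$:

$$\mathbf{A} = \mathbf{L}\cdot\mathbf{U} \to \underbrace{\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 4 & 2 & -2 & 1 & 0 \\ -2 & -1 & 1 & -2 & -6 \end{pmatrix}}_{\mathbf{A}} = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 0 & 0 & 0 & -1 & -4 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}}_{\mathbf{U}}. \tag{8.276}$$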
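The $\mathbf{L}\cdot\mathbf{D}\cdot\mathbf{U}$ decomposition is

$$\underbrace{\begin{pmatrix} 2 & 1 & -1 & 1 & 2 \\ 4 & 2 & -2 & 1 & 0 \\ -2 & -1 & 1 & -2 & -6 \end{pmatrix}}_{\mathbf{A}} = \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{pmatrix}}_{\mathbf{L}} \underbrace{\begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{\mathbf{D}} \underbrace{\begin{pmatrix} 1 & \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & 1 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}}_{\mathbf{U}}. \tag{8.277}$$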



There were no row exchanges, so in effect the permutation matrix P is the identity matrix, and there 
is no need to include it. 

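Lastly, we note that a more robust alternative to the method shown here would be to first apply
the $\mathbf{A}^T$ operator to both sides of the equation so as to map both sides into the row space of $\mathbf{A}$ (the column space of $\mathbf{A}^T$). Then
there would be no need to restrict $\mathbf{b}$ so that it lies in the column space. Our results are then interpreted
as giving us only a projection of $\mathbf{x}$. Taking $\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{x} = \mathbf{A}^T\cdot\mathbf{b}$ and then casting the result into row
echelon form gives

$$\begin{pmatrix} 1 & \frac{1}{2} & -\frac{1}{2} & 0 & -1 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} \frac{1}{22}(b_1 + 7b_2 + 4b_3) \\ \frac{1}{11}(b_1 - 4b_2 - 7b_3) \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{8.278}$$

This suggests we take $x_2 = r$, $x_3 = s$, and $x_5 = t$ and solve so as to get

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} \frac{1}{22}(b_1 + 7b_2 + 4b_3) \\ 0 \\ 0 \\ \frac{1}{11}(b_1 - 4b_2 - 7b_3) \\ 0 \end{pmatrix} + r \begin{pmatrix} -\frac{1}{2} \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + s \begin{pmatrix} \frac{1}{2} \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} + t \begin{pmatrix} 1 \\ 0 \\ 0 \\ -4 \\ 1 \end{pmatrix}. \tag{8.279}$$

We could go on to cast this in terms of combinations of row vectors and right null space vectors, but
will not do so here. It is reiterated that this result is valid for arbitrary $\mathbf{b}$, but that it only represents
a solution which minimizes the residual in $||\mathbf{A}\cdot\mathbf{x} - \mathbf{b}||_2$.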
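
The general solution of this example is easy to check with exact rational arithmetic. The following is a minimal Python sketch, not part of the original notes: it builds the particular solution plus an arbitrary combination of the three right null space vectors $(-1/2, 1, 0, 0, 0)^T$, $(1/2, 0, 1, 0, 0)^T$, $(1, 0, 0, -4, 1)^T$ and applies $\mathbf{A}$ to it. The chosen values of $b_1$, $b_2$, $r$, $s$, $t$ are arbitrary, and $b_3$ is set from the constraint $3b_1 - b_2 + b_3 = 0$.

```python
from fractions import Fraction as Fr

A = [[2, 1, -1, 1, 2],
     [4, 2, -2, 1, 0],
     [-2, -1, 1, -2, -6]]

# Right null space vectors multiplying r, s, and t in the general solution.
NULL_VECTORS = [
    [Fr(-1, 2), 1, 0, 0, 0],
    [Fr(1, 2), 0, 1, 0, 0],
    [1, 0, 0, -4, 1],
]


def apply(matrix, x):
    """Matrix-vector product with exact rational arithmetic."""
    return [sum(Fr(a) * xi for a, xi in zip(row, x)) for row in matrix]


def general_solution(b1, b2, r, s, t):
    """Particular solution plus a combination of the right null space vectors."""
    particular = [Fr(-b1 + b2, 2), 0, 0, 2 * b1 - b2, 0]
    coefficients = (r, s, t)
    return [p + sum(c * v[i] for c, v in zip(coefficients, NULL_VECTORS))
            for i, p in enumerate(particular)]


b1, b2 = 3, 7
b3 = -3 * b1 + b2                      # enforce 3*b1 - b2 + b3 = 0
x = general_solution(b1, b2, r=2, s=-5, t=4)

image = apply(A, x)                    # should reproduce (b1, b2, b3)
null_images = [apply(A, v) for v in NULL_VECTORS]

print("x =", x)
print("A . x =", image)
print("A . (null vectors) =", null_images)
```

Changing `r`, `s`, and `t` changes `x` but not `image`, which illustrates the non-uniqueness of the solution of the under-constrained system.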



8.8.4 Q • R decomposition 

The $\mathbf{Q}\cdot\mathbf{R}$ decomposition allows us to formulate a matrix as the product of an orthogonal
(unitary if complex) matrix $\mathbf{Q}$ and an upper triangular matrix $\mathbf{R}$, of the same dimension as
$\mathbf{A}$. That is, we seek $\mathbf{Q}$ and $\mathbf{R}$ such that

$$\mathbf{A} = \mathbf{Q}\cdot\mathbf{R}. \tag{8.280}$$

The matrix $\mathbf{A}$ can be square or rectangular. See Strang for details of the algorithm. It
can be thought of as a deformation due to $\mathbf{R}$ followed by a volume-preserving rotation or
reflection due to $\mathbf{Q}$.

Example 8.31 

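The $\mathbf{Q}\cdot\mathbf{R}$ decomposition of the matrix we considered in a previous example, Eq. (8.245), is as follows:

$$\underbrace{\begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix}}_{\mathbf{A}} = \mathbf{Q}\cdot\mathbf{R} = \underbrace{\begin{pmatrix} -0.1808 & -0.4982 & 0.8480 \\ -0.7954 & -0.4331 & -0.4240 \\ 0.5785 & -0.7512 & -0.3180 \end{pmatrix}}_{\mathbf{Q}} \underbrace{\begin{pmatrix} 27.6586 & -16.4867 & -19.4153 \\ 0 & -2.0465 & -7.7722 \\ 0 & 0 & 1.9080 \end{pmatrix}}_{\mathbf{R}}. \tag{8.281}$$

Note that $\det \mathbf{Q} = 1$, so it is volume- and orientation-preserving. Noting further that $||\mathbf{Q}||_2 = 1$, we
deduce that $||\mathbf{R}||_2 = ||\mathbf{A}||_2$. And it is easy to show that $||\mathbf{R}||_2 = ||\mathbf{A}||_2 = 37.9423$. Also recalling how
matrices can be thought of as transformations, we see how to think of $\mathbf{A}$ as a stretching ($\mathbf{R}$) followed
by rotation ($\mathbf{Q}$).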

I 



I 

Example 8.32 

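Find the $\mathbf{Q}\cdot\mathbf{R}$ decomposition for our non-square matrix from Example 8.29,

$$\mathbf{A} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}. \tag{8.282}$$

The decomposition is

$$\mathbf{A} = \underbrace{\begin{pmatrix} 0.4472 & -0.8944 \\ 0.8944 & 0.4472 \end{pmatrix}}_{\mathbf{Q}} \cdot \underbrace{\begin{pmatrix} 2.2361 & -1.3416 & 3.5777 \\ 0 & 2.6833 & -0.4472 \end{pmatrix}}_{\mathbf{R}}. \tag{8.283}$$

Once again $\det \mathbf{Q} = 1$, so it is volume- and orientation-preserving. It is easy to show $||\mathbf{A}||_2 = ||\mathbf{R}||_2 =
4.63849$.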



I 

Example 8.33 

Give a geometric interpretation of the $\mathbf{Q}\cdot\mathbf{R}$ decomposition in the context of the discussion
surrounding the transformation of a unit square by the matrix $\mathbf{A}$ considered earlier,

$$\mathbf{A} = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}. \tag{8.284}$$

The decomposition is

$$\mathbf{A} = \underbrace{\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}}_{\mathbf{Q}} \cdot \underbrace{\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}}_{\mathbf{R}}. \tag{8.285}$$

Now $\det \mathbf{A} = 1$. Moreover, $\det \mathbf{Q} = 1$ and $\det \mathbf{R} = 1$, so both of these matrices preserve area and
orientation. As usual $||\mathbf{Q}||_2 = 1$, so its operation preserves the lengths of vectors. The deformation
is embodied in $\mathbf{R}$ which has $||\mathbf{R}||_2 = ||\mathbf{A}||_2 = 1.61803$. Decomposing the transformation of the unit
square depicted in Fig. 8.3 by first applying $\mathbf{R}$ to each of the vertices, and then applying $\mathbf{Q}$ to each
of the stretched vertices, we see that $\mathbf{R}$ effects an area- and orientation-preserving shear deformation,
and $\mathbf{Q}$ effects a counter-clockwise rotation of $\pi/2$. This is depicted in Fig. 8.6.

Figure 8.6: Unit square transforming via explicit stretching ($\mathbf{R}$), and rotation ($\mathbf{Q}$) under a
linear area- and orientation-preserving alibi mapping. (Panels: original unit square; sheared element; sheared and rotated element.)



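The $\mathbf{Q}\cdot\mathbf{R}$ decomposition can be shown to be closely related to the Gram-Schmidt
orthogonalization process. It is also useful in increasing the efficiency of estimating $\mathbf{x}$ for $\mathbf{A}\cdot\mathbf{x} \simeq \mathbf{b}$
when the system is over-constrained; that is, $\mathbf{b}$ is not in the column space of $\mathbf{A}$, $R(\mathbf{A})$. If we,
as usual, operate on both sides as follows,

$$\mathbf{A}\cdot\mathbf{x} \simeq \mathbf{b}, \qquad \mathbf{b} \notin R(\mathbf{A}), \tag{8.286}$$
$$\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{x} = \mathbf{A}^T\cdot\mathbf{b}, \qquad \mathbf{A} = \mathbf{Q}\cdot\mathbf{R}, \tag{8.287}$$
$$(\mathbf{Q}\cdot\mathbf{R})^T\cdot\mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{x} = (\mathbf{Q}\cdot\mathbf{R})^T\cdot\mathbf{b}, \tag{8.288}$$
$$\mathbf{R}^T\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{x} = \mathbf{R}^T\cdot\mathbf{Q}^T\cdot\mathbf{b}, \tag{8.289}$$
$$\mathbf{R}^T\cdot\mathbf{Q}^{-1}\cdot\mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{x} = \mathbf{R}^T\cdot\mathbf{Q}^T\cdot\mathbf{b}, \tag{8.290}$$
$$\mathbf{R}^T\cdot\mathbf{R}\cdot\mathbf{x} = \mathbf{R}^T\cdot\mathbf{Q}^T\cdot\mathbf{b}, \tag{8.291}$$
$$\mathbf{x} = \left(\mathbf{R}^T\cdot\mathbf{R}\right)^{-1}\cdot\mathbf{R}^T\cdot\mathbf{Q}^T\cdot\mathbf{b}, \tag{8.292}$$
$$\mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{x} = \mathbf{Q}\cdot\left(\mathbf{R}\cdot\left(\mathbf{R}^T\cdot\mathbf{R}\right)^{-1}\cdot\mathbf{R}^T\right)\cdot\mathbf{Q}^T\cdot\mathbf{b}, \tag{8.293}$$
$$\mathbf{A}\cdot\mathbf{x} = \underbrace{\mathbf{Q}\cdot\left(\mathbf{R}\cdot\left(\mathbf{R}^T\cdot\mathbf{R}\right)^{-1}\cdot\mathbf{R}^T\right)\cdot\mathbf{Q}^T}_{\mathbf{P}}\cdot\mathbf{b}. \tag{8.294}$$

When rectangular $\mathbf{R}$ has no zeros on its diagonal, $\mathbf{R}\cdot(\mathbf{R}^T\cdot\mathbf{R})^{-1}\cdot\mathbf{R}^T$ has all zeroes, except
for $r$ ones on the diagonal, where $r$ is the rank of $\mathbf{R}$. This makes solution of over-constrained
problems particularly simple. We note lastly that $\mathbf{Q}\cdot\mathbf{R}\cdot(\mathbf{R}^T\cdot\mathbf{R})^{-1}\cdot\mathbf{R}^T\cdot\mathbf{Q}^T = \mathbf{P}$, a
projection matrix, defined first in Eq. (7.160), and to be discussed in Sec. 8.9.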
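
The link to Gram-Schmidt and to over-constrained problems can be sketched as follows. This minimal Python example is an illustration under stated assumptions, not the algorithm of the notes: it uses the modified Gram-Schmidt process to build a "thin" factorization in which $\mathbf{Q}$ has orthonormal columns and $\mathbf{R}$ is square, rather than an $\mathbf{R}$ of the same dimension as $\mathbf{A}$, and it assumes $\mathbf{A}$ has linearly independent columns. The $3 \times 2$ system is hypothetical data chosen only for the demonstration.

```python
import math


def thin_qr(A):
    """Modified Gram-Schmidt: A (m x n, full column rank) = Q (m x n) . R (n x n)."""
    m, n = len(A), len(A[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [float(A[i][j]) for i in range(m)]
        for k in range(j):
            # Remove the component of v along the k-th orthonormal column.
            R[k][j] = sum(Q[i][k] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = math.sqrt(sum(c * c for c in v))
        if R[j][j] == 0.0:
            raise ValueError("columns are linearly dependent")
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    return Q, R


def least_squares(A, b):
    """Minimize ||A.x - b||_2 by solving R . x = Q^T . b with back substitution."""
    Q, R = thin_qr(A)
    n = len(R)
    rhs = [sum(Q[i][j] * b[i] for i in range(len(b))) for j in range(n)]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (rhs[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x


# Hypothetical over-constrained system: three equations, two unknowns.
A = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = least_squares(A, b)

print("least squares estimate x =", x)
```

With the thin factorization, Eq. (8.291) reduces to the triangular system $\mathbf{R}\cdot\mathbf{x} = \mathbf{Q}^T\cdot\mathbf{b}$ because $\mathbf{R}$ is then square and invertible, so the normal equations never need to be formed explicitly.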

8.8.5 Diagonalization 

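Casting a matrix into a form in which all (or sometimes most) of its off-diagonal elements
have zero value has its most important application in solving systems of differential equations
but also in other scenarios. For many cases, we can decompose a square matrix $\mathbf{A}$ into the
form

$$\mathbf{A} = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}, \tag{8.295}$$

where $\mathbf{S}$ is a non-singular matrix and $\boldsymbol{\Lambda}$ is a diagonal matrix. To diagonalize a square matrix
$\mathbf{A}$, we must find $\mathbf{S}$, a diagonalizing matrix, such that $\mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S}$ is diagonal. Not all matrices
are diagonalizable. Note that by inversion, we can also say

$$\boldsymbol{\Lambda} = \mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S}. \tag{8.296}$$

Considering $\mathbf{A}$ to be the original matrix, we have subjected it to a general linear
transformation, which in general stretches and rotates, to arrive at $\boldsymbol{\Lambda}$; this transformation has the
same form as that previously considered in Eq. (7.278).

Theorem

A matrix with distinct eigenvalues can be diagonalized, but the diagonalizing matrix is
not unique.

Definition: The algebraic multiplicity of an eigenvalue is the number of times it occurs as a root of the characteristic polynomial. The
geometric multiplicity of an eigenvalue is the number of linearly independent eigenvectors it has.

Theorem

Nonzero eigenvectors corresponding to different eigenvalues are linearly independent.

Theorem

If $\mathbf{A}$ is an $N \times N$ matrix with $N$ linearly independent right eigenvectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n, \ldots, \mathbf{e}_N\}$
corresponding to eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_n, \ldots, \lambda_N\}$ (not necessarily distinct), then the
$N \times N$ matrix $\mathbf{S}$ whose columns are populated by the eigenvectors of $\mathbf{A}$

$$\mathbf{S} = \begin{pmatrix} \vdots & \vdots & & \vdots & & \vdots \\ \mathbf{e}_1 & \mathbf{e}_2 & \ldots & \mathbf{e}_n & \ldots & \mathbf{e}_N \\ \vdots & \vdots & & \vdots & & \vdots \end{pmatrix} \tag{8.297}$$

makes

$$\mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S} = \boldsymbol{\Lambda}, \tag{8.298}$$

where

$$\boldsymbol{\Lambda} = \begin{pmatrix} \lambda_1 & 0 & \ldots & \ldots & 0 \\ 0 & \lambda_2 & \ddots & & \vdots \\ \vdots & \ddots & \lambda_n & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots & \ldots & 0 & \lambda_N \end{pmatrix} \tag{8.299}$$

is a diagonal matrix of eigenvalues. The matrices $\mathbf{A}$ and $\boldsymbol{\Lambda}$ are similar.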

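Let's see if this recipe works when we fill the columns of $\mathbf{S}$ with the eigenvectors. First
operate on Eq. (8.298) with $\mathbf{S}$ to arrive at a more general version of the eigenvalue problem:

$$\mathbf{A}\cdot\mathbf{S} = \mathbf{S}\cdot\boldsymbol{\Lambda}, \tag{8.300}$$

$$\underbrace{\begin{pmatrix} a_{11} & \ldots & a_{1N} \\ \vdots & \ddots & \vdots \\ a_{N1} & \ldots & a_{NN} \end{pmatrix}}_{=\mathbf{A}} \underbrace{\begin{pmatrix} \vdots & & \vdots \\ \mathbf{e}_1 & \ldots & \mathbf{e}_N \\ \vdots & & \vdots \end{pmatrix}}_{=\mathbf{S}} = \underbrace{\begin{pmatrix} \vdots & & \vdots \\ \mathbf{e}_1 & \ldots & \mathbf{e}_N \\ \vdots & & \vdots \end{pmatrix}}_{=\mathbf{S}} \underbrace{\begin{pmatrix} \lambda_1 & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & \lambda_N \end{pmatrix}}_{=\boldsymbol{\Lambda}}, \tag{8.301}$$

$$= \underbrace{\begin{pmatrix} \vdots & & \vdots \\ \lambda_1 \mathbf{e}_1 & \ldots & \lambda_N \mathbf{e}_N \\ \vdots & & \vdots \end{pmatrix}}_{=\mathbf{S}\cdot\boldsymbol{\Lambda}}. \tag{8.302}$$

Rearranging, we get

$$\mathbf{A}\cdot\mathbf{e}_1 + \cdots + \mathbf{A}\cdot\mathbf{e}_N = \lambda_1 \mathbf{e}_1 + \cdots + \lambda_N \mathbf{e}_N, \tag{8.303}$$
$$\mathbf{A}\cdot\mathbf{e}_1 + \cdots + \mathbf{A}\cdot\mathbf{e}_N = \lambda_1 \mathbf{I}\cdot\mathbf{e}_1 + \cdots + \lambda_N \mathbf{I}\cdot\mathbf{e}_N, \tag{8.304}$$
$$\underbrace{(\mathbf{A} - \lambda_1 \mathbf{I})\cdot\mathbf{e}_1}_{=\mathbf{0}} + \cdots + \underbrace{(\mathbf{A} - \lambda_N \mathbf{I})\cdot\mathbf{e}_N}_{=\mathbf{0}} = \mathbf{0}. \tag{8.305}$$

Now $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ are linearly independent. Thus, this induces $N$ eigenvalue problems, one for
each eigenvector:

$$\mathbf{A}\cdot\mathbf{e}_1 = \lambda_1 \mathbf{I}\cdot\mathbf{e}_1, \tag{8.306}$$
$$\mathbf{A}\cdot\mathbf{e}_2 = \lambda_2 \mathbf{I}\cdot\mathbf{e}_2, \tag{8.307}$$
$$\vdots$$
$$\mathbf{A}\cdot\mathbf{e}_N = \lambda_N \mathbf{I}\cdot\mathbf{e}_N. \tag{8.308}$$

Note also the effect of post-multiplication of both sides of Eq. (8.300) by $\mathbf{S}^{-1}$:

$$\mathbf{A}\cdot\mathbf{S}\cdot\mathbf{S}^{-1} = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}, \tag{8.309}$$

$$\mathbf{A} = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}. \tag{8.310}$$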

Example 8.34 

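Diagonalize the matrix considered in a previous example,

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix}, \tag{8.311}$$

and check. See the example around Eq. (8.245).
The eigenvalue-eigenvector pairs are

$$\lambda_1 = -6, \qquad \mathbf{e}_1 = \begin{pmatrix} -1 \\ -2 \\ 1 \end{pmatrix}, \tag{8.312}$$
$$\lambda_2 = 3, \qquad \mathbf{e}_2 = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}, \tag{8.313}$$
$$\lambda_3 = 6, \qquad \mathbf{e}_3 = \begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix}. \tag{8.314}$$

Then

$$\mathbf{S} = \begin{pmatrix} \vdots & \vdots & \vdots \\ \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \vdots & \vdots & \vdots \end{pmatrix} = \begin{pmatrix} -1 & 1 & 2 \\ -2 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix}. \tag{8.315}$$

The inverse is

$$\mathbf{S}^{-1} = \begin{pmatrix} -\frac{4}{3} & \frac{2}{3} & 1 \\ -\frac{5}{3} & \frac{4}{3} & 1 \\ \frac{2}{3} & -\frac{1}{3} & 0 \end{pmatrix}. \tag{8.316}$$

Thus,

$$\mathbf{A}\cdot\mathbf{S} = \begin{pmatrix} 6 & 3 & 12 \\ 12 & 6 & 6 \\ -6 & 0 & 12 \end{pmatrix}, \tag{8.317}$$

and

$$\boldsymbol{\Lambda} = \mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S} = \begin{pmatrix} -6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix}. \tag{8.318}$$

Let us also note the complementary decomposition of $\mathbf{A}$:

$$\mathbf{A} = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1} = \underbrace{\begin{pmatrix} -1 & 1 & 2 \\ -2 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix}}_{\mathbf{S}} \underbrace{\begin{pmatrix} -6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix}}_{\boldsymbol{\Lambda}} \underbrace{\begin{pmatrix} -\frac{4}{3} & \frac{2}{3} & 1 \\ -\frac{5}{3} & \frac{4}{3} & 1 \\ \frac{2}{3} & -\frac{1}{3} & 0 \end{pmatrix}}_{\mathbf{S}^{-1}} \tag{8.319}$$

$$= \underbrace{\begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix}}_{\mathbf{A}}. \tag{8.320}$$

Note that because the matrix is not symmetric, the eigenvectors are not orthogonal, e.g. $\mathbf{e}_1^T\cdot\mathbf{e}_2 = -5$.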
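
The check requested in this example can be carried out with exact rational arithmetic. The following is a minimal Python sketch, not part of the original notes; it assumes the eigenvector matrix with columns $(-1, -2, 1)^T$, $(1, 2, 0)^T$, $(2, 1, 2)^T$ and the inverse whose rows are $(-4/3, 2/3, 1)$, $(-5/3, 4/3, 1)$, $(2/3, -1/3, 0)$. The helper `matmul` is an illustrative name.

```python
from fractions import Fraction as Fr

A = [[-5, 4, 9],
     [-22, 14, 18],
     [16, -8, -6]]

# Columns of S are the eigenvectors for eigenvalues -6, 3, and 6.
S = [[-1, 1, 2],
     [-2, 2, 1],
     [1, 0, 2]]

S_inv = [[Fr(-4, 3), Fr(2, 3), 1],
         [Fr(-5, 3), Fr(4, 3), 1],
         [Fr(2, 3), Fr(-1, 3), 0]]


def matmul(X, Y):
    """Exact matrix product of two lists of rows."""
    return [[sum(Fr(X[i][k]) * Fr(Y[k][j]) for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


identity = matmul(S_inv, S)                 # should be I
AS = matmul(A, S)                           # columns are lambda_n * e_n
Lam = matmul(matmul(S_inv, A), S)           # should be diag(-6, 3, 6)
A_rebuilt = matmul(matmul(S, Lam), S_inv)   # should return A

# Eigenvectors of a non-symmetric matrix need not be orthogonal.
e1_dot_e2 = sum(S[i][0] * S[i][1] for i in range(3))

print("S^-1 . A . S =", Lam)
print("e1 . e2 =", e1_dot_e2)
```

The product $\mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S}$ should come out exactly diagonal, and the non-zero inner product of the first two eigenvectors confirms that they are not orthogonal.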
Note that if $\mathbf{A}$ is symmetric (Hermitian), then its eigenvectors corresponding to distinct eigenvalues must be orthogonal, and a full orthogonal set can always be chosen; thus,
it is possible to normalize the eigenvectors so that the matrix $\mathbf{S}$ is in fact orthogonal (unitary
if complex). Thus, for symmetric $\mathbf{A}$ we have

$$\mathbf{A} = \mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^{-1}. \tag{8.321}$$

Since $\mathbf{Q}^{-1} = \mathbf{Q}^T$, we have

$$\mathbf{A} = \mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^T. \tag{8.322}$$

Geometrically, the action of a symmetric $\mathbf{A}$ on a geometric entity can be considered as
volume-preserving rotation or reflection via $\mathbf{Q}^T$, followed by a stretching due to $\boldsymbol{\Lambda}$, completed
by another volume-preserving rotation or reflection via $\mathbf{Q}$, which acts opposite to the effect
of $\mathbf{Q}^T$. Note also that with $\mathbf{A}\cdot\mathbf{S} = \mathbf{S}\cdot\boldsymbol{\Lambda}$, the column vectors of $\mathbf{S}$ (which are the right
eigenvectors of $\mathbf{A}$) form a basis in $\mathbb{C}^N$.



I 

Example 8.35 

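Consider the action of the matrix

$$\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \tag{8.323}$$

on a unit square in terms of the diagonal decomposition of $\mathbf{A}$.

We first note that $\det \mathbf{A} = 1$, so it preserves volumes and orientations. We easily calculate that
$||\mathbf{A}||_2 = 3/2 + \sqrt{5}/2 = 2.61803$, so it has the potential to stretch a vector. It is symmetric, so it has
real eigenvalues, which are $\lambda = 3/2 \pm \sqrt{5}/2$. Its spectral radius is thus $\rho(\mathbf{A}) = 3/2 + \sqrt{5}/2$, which is
equal to its spectral norm. Its eigenvectors are orthogonal, so they can be orthonormalized to form an
orthogonal matrix. After detailed calculation, one finds the diagonal decomposition to be

$$\mathbf{A} = \underbrace{\begin{pmatrix} \sqrt{\frac{5+\sqrt{5}}{10}} & -\sqrt{\frac{2}{5+\sqrt{5}}} \\ \sqrt{\frac{2}{5+\sqrt{5}}} & \sqrt{\frac{5+\sqrt{5}}{10}} \end{pmatrix}}_{\mathbf{Q}} \underbrace{\begin{pmatrix} \frac{3+\sqrt{5}}{2} & 0 \\ 0 & \frac{3-\sqrt{5}}{2} \end{pmatrix}}_{\boldsymbol{\Lambda}} \underbrace{\begin{pmatrix} \sqrt{\frac{5+\sqrt{5}}{10}} & \sqrt{\frac{2}{5+\sqrt{5}}} \\ -\sqrt{\frac{2}{5+\sqrt{5}}} & \sqrt{\frac{5+\sqrt{5}}{10}} \end{pmatrix}}_{\mathbf{Q}^T}. \tag{8.324}$$

The action of this composition of matrix operations on a unit square is depicted in Fig. 8.7. The first
rotation is induced by $\mathbf{Q}^T$ and is clockwise through an angle of $\pi/5.67511 = 31.717°$. This is followed
by an eigen-stretching of $\boldsymbol{\Lambda}$. The action is completed by a rotation induced by $\mathbf{Q}$. The second rotation
reverses the angle of the first in a counterclockwise rotation of $\pi/5.67511 = 31.717°$.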

I 



Consider now the right eigensystem of the adjoint of $\mathbf{A}$, denoted by $\mathbf{A}^*$:

$$\mathbf{A}^*\cdot\mathbf{V} = \mathbf{V}\cdot\boldsymbol{\Lambda}^*, \tag{8.325}$$

where $\boldsymbol{\Lambda}^*$ is the diagonal matrix containing the eigenvalues of $\mathbf{A}^*$, and
$\mathbf{V}$ is the matrix whose columns are populated by the (right) eigenvectors of $\mathbf{A}^*$. Now we
know from an earlier proof, Sec. 7.4.4, that the eigenvalues of the adjoint are the complex conjugates of
those of the original operator, thus $\boldsymbol{\Lambda}^* = \overline{\boldsymbol{\Lambda}}$. Also the
adjoint operator for matrices is the Hermitian transpose. So, we find that

$$\mathbf{A}^H\cdot\mathbf{V} = \mathbf{V}\cdot\boldsymbol{\Lambda}^H. \tag{8.326}$$

Taking the Hermitian transpose of both sides, we recover

$$\mathbf{V}^H\cdot\mathbf{A} = \boldsymbol{\Lambda}\cdot\mathbf{V}^H. \tag{8.327}$$

So we see clearly that the left eigenvectors of a linear operator are the right eigenvectors of
the adjoint of that operator.

[Figure 8.7 panels: original unit square; after first rotation; after eigen-stretching.]

Figure 8.7: Unit square transforming via rotation, stretching, and rotation of the diagonalization
decomposition under a linear area- and orientation-preserving alibi mapping.

It is also possible to show that, remarkably, when we take the product of the matrix of
right eigenvectors of the operator with the matrix of right eigenvectors of its adjoint,
we obtain a diagonal matrix, which we denote as $\mathbf{D}$:

$$\mathbf{S}^H\cdot\mathbf{V} = \mathbf{D}. \tag{8.328}$$

Equivalently, this states that the inner product of the left eigenvector matrix with the right
eigenvector matrix is diagonal. Let us see how this comes about. Let $\mathbf{s}_i$ be a right eigenvector
of $\mathbf{A}$ with eigenvalue $\lambda_i$ and $\mathbf{v}_j$ be a left eigenvector of $\mathbf{A}$ with
eigenvalue $\lambda_j$. Then

$$\mathbf{A}\cdot\mathbf{s}_i = \lambda_i \mathbf{s}_i, \tag{8.329}$$

and

$$\mathbf{v}_j^H\cdot\mathbf{A} = \lambda_j \mathbf{v}_j^H. \tag{8.330}$$

If we premultiply the first eigen-relation, Eq. (8.329), by $\mathbf{v}_j^H$, we obtain

$$\mathbf{v}_j^H\cdot\mathbf{A}\cdot\mathbf{s}_i = \mathbf{v}_j^H\cdot(\lambda_i \mathbf{s}_i). \tag{8.331}$$

Substituting from the second eigen-relation, Eq. (8.330), and rearranging, Eq. (8.331) becomes

$$\lambda_j \mathbf{v}_j^H\cdot\mathbf{s}_i = \lambda_i \mathbf{v}_j^H\cdot\mathbf{s}_i. \tag{8.332}$$

Rearranging

$$(\lambda_j - \lambda_i)\left(\mathbf{v}_j^H\cdot\mathbf{s}_i\right) = 0. \tag{8.333}$$

Now if $i \ne j$ and $\lambda_i \ne \lambda_j$, we must have

$$\mathbf{v}_j^H\cdot\mathbf{s}_i = 0, \tag{8.334}$$

or, taking the Hermitian transpose,

$$\mathbf{s}_i^H\cdot\mathbf{v}_j = 0. \tag{8.335}$$

If $i = j$, then all we can say is that $\mathbf{s}_i^H\cdot\mathbf{v}_j$ is some arbitrary scalar. Hence we
have shown the desired relation that $\mathbf{S}^H\cdot\mathbf{V} = \mathbf{D}$.

Since eigenvectors have an arbitrary magnitude, it is a straightforward process to scale
either $\mathbf{V}$ or $\mathbf{S}$ such that the diagonal matrix is actually the identity matrix. Here we
choose to scale $\mathbf{V}$, given that our task was to find the reciprocal basis vectors of $\mathbf{S}$.
We take then

$$\mathbf{S}^H\cdot\hat{\mathbf{V}} = \mathbf{I}. \tag{8.336}$$

Here $\hat{\mathbf{V}}$ denotes the matrix in which each eigenvector (column) of the original $\mathbf{V}$ has
been scaled such that Eq. (8.336) is achieved. Hence $\hat{\mathbf{V}}$ is seen to give the set of reciprocal
basis vectors for the basis defined by $\mathbf{S}$:

$$\mathbf{S}^R = \hat{\mathbf{V}}. \tag{8.337}$$

It is also easy to see then that the inverse of the matrix $\mathbf{S}$ is given by

$$\mathbf{S}^{-1} = \hat{\mathbf{V}}^H. \tag{8.338}$$



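Example 8.36

For a matrix $\mathbf{A}$ considered in an earlier example, p. 363, consider the basis formed by its matrix
of eigenvectors $\mathbf{S}$, and use the properly scaled matrix of eigenvectors of
$\mathbf{A}^* = \mathbf{A}^H$ to determine the reciprocal basis $\mathbf{S}^R$.

We will take

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix}. \tag{8.339}$$

As found before, the eigenvalue-(right) eigenvector pairs are

$$\lambda_1 = -6, \quad \mathbf{e}_{1R} = \begin{pmatrix} -1 \\ -2 \\ 1 \end{pmatrix}, \tag{8.340}$$

$$\lambda_2 = 3, \quad \mathbf{e}_{2R} = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}, \tag{8.341}$$

$$\lambda_3 = 6, \quad \mathbf{e}_{3R} = \begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix}. \tag{8.342}$$

Then we take the matrix of basis vectors to be

$$\mathbf{S} = \begin{pmatrix} \vdots & \vdots & \vdots \\ \mathbf{e}_{1R} & \mathbf{e}_{2R} & \mathbf{e}_{3R} \\ \vdots & \vdots & \vdots \end{pmatrix}
= \begin{pmatrix} -1 & 1 & 2 \\ -2 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix}. \tag{8.344}$$

The adjoint of $\mathbf{A}$ is

$$\mathbf{A}^H = \begin{pmatrix} -5 & -22 & 16 \\ 4 & 14 & -8 \\ 9 & 18 & -6 \end{pmatrix}. \tag{8.345}$$

The eigenvalue-(right) eigenvector pairs of $\mathbf{A}^H$, which are the left eigenvectors of $\mathbf{A}$,
are found to be

$$\lambda_1 = -6, \quad \mathbf{e}_{1L} = \begin{pmatrix} -4 \\ 2 \\ 3 \end{pmatrix}, \tag{8.346}$$

$$\lambda_2 = 3, \quad \mathbf{e}_{2L} = \begin{pmatrix} -5 \\ 4 \\ 3 \end{pmatrix}, \tag{8.347}$$

$$\lambda_3 = 6, \quad \mathbf{e}_{3L} = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}. \tag{8.348}$$

So the matrix of right eigenvectors of the adjoint, which contains the left eigenvectors of the original
matrix, is

$$\mathbf{V} = \begin{pmatrix} \vdots & \vdots & \vdots \\ \mathbf{e}_{1L} & \mathbf{e}_{2L} & \mathbf{e}_{3L} \\ \vdots & \vdots & \vdots \end{pmatrix}
= \begin{pmatrix} -4 & -5 & -2 \\ 2 & 4 & 1 \\ 3 & 3 & 0 \end{pmatrix}. \tag{8.350}$$

We indeed find that the inner product of $\mathbf{S}$ and $\mathbf{V}$ is a diagonal matrix $\mathbf{D}$:

$$\mathbf{S}^H\cdot\mathbf{V} =
\begin{pmatrix} -1 & -2 & 1 \\ 1 & 2 & 0 \\ 2 & 1 & 2 \end{pmatrix}
\cdot
\begin{pmatrix} -4 & -5 & -2 \\ 2 & 4 & 1 \\ 3 & 3 & 0 \end{pmatrix}
= \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -3 \end{pmatrix}. \tag{8.351}$$

Using our knowledge of $\mathbf{D}$, we individually scale each column of $\mathbf{V}$ to form the desired
reciprocal basis

$$\hat{\mathbf{V}} = \begin{pmatrix} -4/3 & -5/3 & 2/3 \\ 2/3 & 4/3 & -1/3 \\ 1 & 1 & 0 \end{pmatrix} = \mathbf{S}^R. \tag{8.352}$$

Then we see that the inner product of $\mathbf{S}$ and the reciprocal basis
$\hat{\mathbf{V}} = \mathbf{S}^R$ is indeed the identity matrix:

$$\mathbf{S}^H\cdot\hat{\mathbf{V}} =
\begin{pmatrix} -1 & -2 & 1 \\ 1 & 2 & 0 \\ 2 & 1 & 2 \end{pmatrix}
\cdot
\begin{pmatrix} -4/3 & -5/3 & 2/3 \\ 2/3 & 4/3 & -1/3 \\ 1 & 1 & 0 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.353}$$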
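The reciprocal-basis construction of this example can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original notes; the helper names `matmul` and `transpose` and the variable `V_hat` are illustrative choices, and the matrices are real, so the Hermitian transpose reduces to the ordinary transpose.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[-5, 4, 9], [-22, 14, 18], [16, -8, -6]]
S = [[-1, 1, 2], [-2, 2, 1], [1, 0, 2]]    # columns: right eigenvectors of A
V = [[-4, -5, -2], [2, 4, 1], [3, 3, 0]]   # columns: right eigenvectors of A^T

# Biorthogonality: S^T . V should be diagonal.
D = matmul(transpose(S), V)

# Scale column j of V by 1/D[j][j] to obtain the reciprocal basis.
V_hat = [[Fraction(V[i][j], D[j][j]) for j in range(3)] for i in range(3)]
check_identity = matmul(transpose(S), V_hat)

# Because S^{-1} = V_hat^T, the product V_hat^T . A . S diagonalizes A.
Lam = matmul(matmul(transpose(V_hat), A), S)
print(D, Lam)
```

Working in `Fraction` keeps every entry exact, so the diagonal structure is visible without floating-point noise.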
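8.8.6 Jordan canonical form

A square matrix $\mathbf{A}$ without a sufficient number of linearly independent eigenvectors can still
be decomposed into a near-diagonal form:

$$\mathbf{A} = \mathbf{S}\cdot\mathbf{J}\cdot\mathbf{S}^{-1}. \tag{8.354}$$

This form is known as the Jordan⁵ (upper) canonical form in which the near-diagonal matrix $\mathbf{J}$:

$$\mathbf{J} = \mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S}, \tag{8.355}$$

has zeros everywhere except for eigenvalues along the principal diagonal and unity above the
missing eigenvectors. The form is sometimes called a Jordan normal form.

Consider the eigenvalue $\lambda$ of algebraic multiplicity $N - L + 1$ of the matrix
$\mathbf{A}_{N\times N}$. Then

$$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{e} = \mathbf{0}, \tag{8.356}$$

gives some linearly independent eigenvectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_L$. If $L = N$,
the algebraic multiplicity is unity, and the matrix can be diagonalized. If, however, $L < N$ we need
$N - L$ more linearly independent vectors. These are the generalized eigenvectors. One can be obtained
from

$$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g}_1 = \mathbf{e}, \tag{8.357}$$

and others from

$$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g}_{j+1} = \mathbf{g}_j \quad \text{for } j = 1, 2, \ldots, N - L - 1. \tag{8.358}$$

This procedure is continued until $N$ linearly independent eigenvectors and generalized eigenvectors
are obtained, which is the most that we can have in $\mathbb{R}^N$. Then

$$\mathbf{S} = \begin{pmatrix} \vdots & & \vdots & \vdots & & \vdots \\ \mathbf{e}_1 & \ldots & \mathbf{e}_L & \mathbf{g}_1 & \ldots & \mathbf{g}_{N-L} \\ \vdots & & \vdots & \vdots & & \vdots \end{pmatrix} \tag{8.359}$$

gives $\mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S} = \mathbf{J}$, where $\mathbf{J}$ is of the Jordan
canonical form.
Notice that $\mathbf{g}_n$ also satisfies $(\mathbf{A} - \lambda\mathbf{I})^{n+1}\cdot\mathbf{g}_n = \mathbf{0}$.
For example, if

$$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g} = \mathbf{e}, \tag{8.360}$$

$$(\mathbf{A} - \lambda\mathbf{I})\cdot(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g} = (\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{e}, \tag{8.361}$$

$$(\mathbf{A} - \lambda\mathbf{I})\cdot(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g} = \mathbf{0}, \tag{8.362}$$

$$(\mathbf{A} - \lambda\mathbf{I})^2\cdot\mathbf{g} = \mathbf{0}. \tag{8.363}$$

However, a solution of Eq. (8.363) is not necessarily a generalized eigenvector.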



5 Marie Ennemond Camille Jordan, 1838-1922, French mathematician. 
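Example 8.37

Find the Jordan canonical form of

$$\mathbf{A} = \begin{pmatrix} 4 & 1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix}. \tag{8.364}$$

The eigenvalues are $\lambda = 4$ with multiplicity three. For this value

$$(\mathbf{A} - \lambda\mathbf{I}) = \begin{pmatrix} 0 & 1 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. \tag{8.365}$$

The eigenvectors are obtained from $(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{e}_1 = \mathbf{0}$, which gives
$x_2 + 3x_3 = 0$, $x_3 = 0$. The most general form of the eigenvector is

$$\mathbf{e}_1 = \begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix}. \tag{8.366}$$

Only one eigenvector can be obtained from this eigenvalue. To get a generalized eigenvector, we take
$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g}_1 = \mathbf{e}_1$, which gives $x_2 + 3x_3 = a$, $x_3 = 0$,
so that

$$\mathbf{g}_1 = \begin{pmatrix} b \\ a \\ 0 \end{pmatrix}. \tag{8.367}$$

Another generalized eigenvector can be similarly obtained from
$(\mathbf{A} - \lambda\mathbf{I})\cdot\mathbf{g}_2 = \mathbf{g}_1$, so that $x_2 + 3x_3 = b$, $x_3 = a$.
Thus, we get

$$\mathbf{g}_2 = \begin{pmatrix} c \\ b - 3a \\ a \end{pmatrix}. \tag{8.368}$$

From the eigenvector and generalized eigenvectors

$$\mathbf{S} = \begin{pmatrix} a & b & c \\ 0 & a & b - 3a \\ 0 & 0 & a \end{pmatrix}, \tag{8.369}$$

and

$$\mathbf{S}^{-1} = \begin{pmatrix} \frac{1}{a} & -\frac{b}{a^2} & \frac{b^2 - 3ab - ac}{a^3} \\ 0 & \frac{1}{a} & \frac{-b + 3a}{a^2} \\ 0 & 0 & \frac{1}{a} \end{pmatrix}. \tag{8.370}$$

The Jordan canonical form is

$$\mathbf{J} = \mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S} = \begin{pmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix}. \tag{8.371}$$

Note that in Eq. (8.370), $a$, $b$, and $c$ are any constants with $a \ne 0$. Choosing $a = 1$,
$b = c = 0$, for example, simplifies the algebra giving

$$\mathbf{S} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{pmatrix}, \tag{8.372}$$

and

$$\mathbf{S}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.373}$$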
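As a check on this example, the sketch below builds $\mathbf{S}$ from the eigenvector and the two generalized eigenvectors for an arbitrary admissible choice of the constants and confirms that $\mathbf{S}^{-1}\cdot\mathbf{A}\cdot\mathbf{S}$ is the Jordan form. The function name `jordan_basis` and the sample constants are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def jordan_basis(a, b, c):
    """S = (e1, g1, g2) for A = [[4,1,3],[0,4,1],[0,0,4]] and its inverse; requires a != 0."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    S = [[a, b, c],
         [0, a, b - 3 * a],
         [0, 0, a]]
    S_inv = [[1 / a, -b / a**2, (b * b - 3 * a * b - a * c) / a**3],
             [0, 1 / a, (3 * a - b) / a**2],
             [0, 0, 1 / a]]
    return S, S_inv

A = [[4, 1, 3], [0, 4, 1], [0, 0, 4]]
S, S_inv = jordan_basis(2, 5, -7)      # any constants with a != 0 should work
J = matmul(matmul(S_inv, A), S)
print(J)
```

The result does not depend on the particular constants, which is the sense in which the Jordan basis is non-unique.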




8.8.7 Schur decomposition 

The Schur⁶ decomposition is as follows:

$$\mathbf{A} = \mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{Q}^T. \tag{8.374}$$

Here $\mathbf{Q}$ is an orthogonal (unitary if complex) matrix, and $\mathbf{R}$ is upper triangular, with
the eigenvalues this time along the diagonal. The matrix $\mathbf{A}$ must be square.



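Example 8.38

The Schur decomposition of the matrix we diagonalized in a previous example, p. 364, is as follows:

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix} = \mathbf{Q}\cdot\mathbf{R}\cdot\mathbf{Q}^T = \tag{8.375}$$

$$\underbrace{\begin{pmatrix} -0.4082 & 0.1826 & 0.8944 \\ -0.8165 & 0.3651 & -0.4472 \\ 0.4082 & 0.9129 & 0 \end{pmatrix}}_{\mathbf{Q}}
\cdot
\underbrace{\begin{pmatrix} -6 & -20.1246 & 31.0376 \\ 0 & 3 & 5.7155 \\ 0 & 0 & 6 \end{pmatrix}}_{\mathbf{R}}
\cdot
\underbrace{\begin{pmatrix} -0.4082 & -0.8165 & 0.4082 \\ 0.1826 & 0.3651 & 0.9129 \\ 0.8944 & -0.4472 & 0 \end{pmatrix}}_{\mathbf{Q}^T}. \tag{8.376}$$

This decomposition was achieved with numerical software. This particular $\mathbf{Q}$ has
$\det\mathbf{Q} = -1$, so if it were used in a coordinate transformation it would be volume-preserving but
not orientation-preserving. Since the Schur decomposition is non-unique, it could be easily re-calculated
if one also wanted to preserve orientation.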



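Example 8.39

The Schur decomposition of another matrix considered in earlier examples, see p. 348, is as follows:

$$\mathbf{A} = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}, \tag{8.377}$$

$$= \underbrace{\begin{pmatrix} \frac{1+\sqrt{3}i}{2\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1-\sqrt{3}i}{2\sqrt{2}} \end{pmatrix}}_{\mathbf{U}}
\cdot
\underbrace{\begin{pmatrix} \frac{-1+\sqrt{3}i}{2} & \frac{-1+\sqrt{3}i}{2} \\ 0 & \frac{-1-\sqrt{3}i}{2} \end{pmatrix}}_{\mathbf{R}}
\cdot
\underbrace{\begin{pmatrix} \frac{1-\sqrt{3}i}{2\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1+\sqrt{3}i}{2\sqrt{2}} \end{pmatrix}}_{\mathbf{U}^H}. \tag{8.378}$$

This is a non-unique decomposition. Unusually, the form given here is exact; most require numerical
approximation. Note that because $\mathbf{R}$ has the eigenvalues of $\mathbf{A}$ on its diagonal, we must
consider complex unitary matrices. When this is recomposed, we recover the original real $\mathbf{A}$. Once
again, we have $\|\mathbf{R}\|_2 = \|\mathbf{A}\|_2 = 1.61803$. Here
$\det\mathbf{U} = \det\mathbf{U}^H = 1$, so both are area- and orientation-preserving.
We can imagine the operation of $\mathbf{A}$ on a real vector $\mathbf{x}$ as an initial rotation into the
complex plane effected by application of $\mathbf{U}^H$: $\mathbf{x}' = \mathbf{U}^H\cdot\mathbf{x}$. This is
followed by an eigen-stretching effected by $\mathbf{R}$: $\mathbf{x}'' = \mathbf{R}\cdot\mathbf{x}'$.
Application of $\mathbf{U}$ rotates back into the real plane: $\mathbf{x}''' = \mathbf{U}\cdot\mathbf{x}''$.
The composite effect is
$\mathbf{x}''' = \mathbf{U}\cdot\mathbf{R}\cdot\mathbf{U}^H\cdot\mathbf{x} = \mathbf{A}\cdot\mathbf{x}$.

6 Issai Schur, 1875-1941, Belarusian-born German-based mathematician.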



If $\mathbf{A}$ is symmetric, then the upper triangular matrix $\mathbf{R}$ reduces to the diagonal matrix
with eigenvalues on the diagonal, $\boldsymbol{\Lambda}$; the Schur decomposition is in this case simply
$\mathbf{A} = \mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^T$.

8.8.8 Singular value decomposition 

The singular value decomposition (SVD) applies even to non-square matrices and is the most
general form of diagonalization. Any complex matrix $\mathbf{A}_{N\times M}$ can be factored into the form

$$\mathbf{A}_{N\times M} = \mathbf{Q}_{N\times N}\cdot\mathbf{B}_{N\times M}\cdot\mathbf{Q}^H_{M\times M}, \tag{8.379}$$

where $\mathbf{Q}_{N\times N}$ and $\mathbf{Q}^H_{M\times M}$ are orthogonal (unitary, if complex) matrices,
and $\mathbf{B}$ has positive numbers $\mu_i$, $(i = 1, 2, \ldots, r)$ in the first $r$ positions on the main
diagonal, and zero everywhere else. It turns out that $r$ is the rank of $\mathbf{A}_{N\times M}$. The
columns of $\mathbf{Q}_{N\times N}$ are the eigenvectors of
$\mathbf{A}_{N\times M}\cdot\mathbf{A}^H_{N\times M}$. The columns of $\mathbf{Q}_{M\times M}$ are the
eigenvectors of $\mathbf{A}^H_{N\times M}\cdot\mathbf{A}_{N\times M}$. The values $\mu_i$,
$(i = 1, 2, \ldots, r) \in \mathbb{R}^1$ are called the singular values of $\mathbf{A}$. They are analogous
to eigenvalues and are in fact the positive square roots of the non-zero eigenvalues of
$\mathbf{A}_{N\times M}\cdot\mathbf{A}^H_{N\times M}$ or
$\mathbf{A}^H_{N\times M}\cdot\mathbf{A}_{N\times M}$. Note that since the matrix from which the eigenvalues
are drawn is Hermitian, the eigenvalues, and thus the singular values, are guaranteed real. Note also
that if $\mathbf{A}$ itself is square and Hermitian, the absolute values of the eigenvalues of $\mathbf{A}$
will equal its singular values. If $\mathbf{A}$ is square and non-Hermitian, there is no simple relation
between its eigenvalues and singular values. The factorization
$\mathbf{Q}_{N\times N}\cdot\mathbf{B}_{N\times M}\cdot\mathbf{Q}^H_{M\times M}$ is called the singular value
decomposition.

As discussed by Strang, the column vectors of $\mathbf{Q}_{N\times N}$ and $\mathbf{Q}_{M\times M}$ are even
more than orthonormal. They also must be chosen in such a way that each column of
$\mathbf{A}_{N\times M}\cdot\mathbf{Q}_{M\times M}$ is a scalar multiple of the corresponding column of
$\mathbf{Q}_{N\times N}$. This comes directly from post-multiplying the general form of the singular value
decomposition by $\mathbf{Q}_{M\times M}$:
$\mathbf{A}_{N\times M}\cdot\mathbf{Q}_{M\times M} = \mathbf{Q}_{N\times N}\cdot\mathbf{B}_{N\times M}$. So in
fact a more robust way of computing the singular value decomposition is to first compute one of the
orthogonal matrices, and then compute the other orthogonal matrix with which the first one is consistent.



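Example 8.40

Find the singular value decomposition of the matrix from p. 366,

$$\mathbf{A}_{2\times 3} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}. \tag{8.380}$$

The matrix is real so we do not need to consider the conjugate transpose; we will retain the notation
for generality though here the ordinary transpose would suffice. First consider
$\mathbf{A}\cdot\mathbf{A}^H$:

$$\mathbf{A}\cdot\mathbf{A}^H =
\underbrace{\begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}}_{\mathbf{A}}
\cdot
\underbrace{\begin{pmatrix} 1 & 2 \\ -3 & 0 \\ 2 & 3 \end{pmatrix}}_{\mathbf{A}^H}
= \begin{pmatrix} 14 & 8 \\ 8 & 13 \end{pmatrix}. \tag{8.381}$$

The diagonal eigenvalue matrix and corresponding orthogonal matrix composed of the normalized
eigenvectors in the columns are

$$\boldsymbol{\Lambda}_{2\times 2} = \begin{pmatrix} 21.5156 & 0 \\ 0 & 5.48439 \end{pmatrix}, \qquad
\mathbf{Q}_{2\times 2} = \begin{pmatrix} 0.728827 & -0.684698 \\ 0.684698 & 0.728827 \end{pmatrix}. \tag{8.382}$$

Next we consider $\mathbf{A}^H\cdot\mathbf{A}$:

$$\mathbf{A}^H\cdot\mathbf{A} =
\underbrace{\begin{pmatrix} 1 & 2 \\ -3 & 0 \\ 2 & 3 \end{pmatrix}}_{\mathbf{A}^H}
\cdot
\underbrace{\begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}}_{\mathbf{A}}
= \begin{pmatrix} 5 & -3 & 8 \\ -3 & 9 & -6 \\ 8 & -6 & 13 \end{pmatrix}. \tag{8.383}$$

The diagonal eigenvalue matrix and corresponding orthogonal matrix composed of the normalized
eigenvectors in the columns are

$$\boldsymbol{\Lambda}_{3\times 3} = \begin{pmatrix} 21.52 & 0 & 0 \\ 0 & 5.484 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\mathbf{Q}_{3\times 3} = \begin{pmatrix} 0.4524 & 0.3301 & -0.8285 \\ -0.4714 & 0.8771 & 0.09206 \\ 0.7571 & 0.3489 & 0.5523 \end{pmatrix}. \tag{8.384}$$

We take

$$\mathbf{B}_{2\times 3} = \begin{pmatrix} \sqrt{21.52} & 0 & 0 \\ 0 & \sqrt{5.484} & 0 \end{pmatrix}
= \begin{pmatrix} 4.639 & 0 & 0 \\ 0 & 2.342 & 0 \end{pmatrix}, \tag{8.385}$$

and can easily verify that

$$\mathbf{Q}_{2\times 2}\cdot\mathbf{B}_{2\times 3}\cdot\mathbf{Q}^H_{3\times 3} =
\underbrace{\begin{pmatrix} 0.7288 & -0.6847 \\ 0.6847 & 0.7288 \end{pmatrix}}_{\mathbf{Q}_{2\times 2}}
\cdot
\underbrace{\begin{pmatrix} 4.639 & 0 & 0 \\ 0 & 2.342 & 0 \end{pmatrix}}_{\mathbf{B}_{2\times 3}}
\cdot
\underbrace{\begin{pmatrix} 0.4524 & -0.4714 & 0.7571 \\ 0.3301 & 0.8771 & 0.3489 \\ -0.8285 & 0.09206 & 0.5523 \end{pmatrix}}_{\mathbf{Q}^H_{3\times 3}}, \tag{8.386}$$

$$= \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix} = \mathbf{A}_{2\times 3}. \tag{8.387}$$

The singular values here are $\mu_1 = 4.639$, $\mu_2 = 2.342$. As an aside, both
$\det\mathbf{Q}_{2\times 2} = 1$ and $\det\mathbf{Q}_{3\times 3} = 1$, so they are orientation-preserving.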

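Let's see how we can get another singular value decomposition of the same matrix. Here we will
employ the more robust technique of computing the decomposition. The orthogonal matrices
$\mathbf{Q}_{3\times 3}$ and $\mathbf{Q}_{2\times 2}$ are not unique as one can multiply any row or column by
$-1$ and still maintain orthonormality. For example, instead of the value found earlier, let us presume
that we found

$$\mathbf{Q}_{3\times 3} = \begin{pmatrix} -0.4524 & 0.3301 & -0.8285 \\ 0.4714 & 0.8771 & 0.09206 \\ -0.7571 & 0.3489 & 0.5523 \end{pmatrix}. \tag{8.388}$$

Here, the first column of the original $\mathbf{Q}_{3\times 3}$ has been multiplied by $-1$. If we used this
new $\mathbf{Q}_{3\times 3}$ in conjunction with the previously found matrices to form
$\mathbf{Q}_{2\times 2}\cdot\mathbf{B}_{2\times 3}\cdot\mathbf{Q}^H_{3\times 3}$, we would not recover
$\mathbf{A}_{2\times 3}$! The more robust way is to take

$$\mathbf{A}_{2\times 3} = \mathbf{Q}_{2\times 2}\cdot\mathbf{B}_{2\times 3}\cdot\mathbf{Q}^H_{3\times 3}, \tag{8.389}$$

$$\mathbf{A}_{2\times 3}\cdot\mathbf{Q}_{3\times 3} = \mathbf{Q}_{2\times 2}\cdot\mathbf{B}_{2\times 3}, \tag{8.390}$$

$$\underbrace{\begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}}_{\mathbf{A}_{2\times 3}}
\cdot
\underbrace{\begin{pmatrix} -0.4524 & 0.3301 & -0.8285 \\ 0.4714 & 0.8771 & 0.09206 \\ -0.7571 & 0.3489 & 0.5523 \end{pmatrix}}_{\mathbf{Q}_{3\times 3}}
=
\underbrace{\begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{pmatrix}}_{\mathbf{Q}_{2\times 2}}
\cdot
\underbrace{\begin{pmatrix} 4.639 & 0 & 0 \\ 0 & 2.342 & 0 \end{pmatrix}}_{\mathbf{B}_{2\times 3}}, \tag{8.391}$$

$$\begin{pmatrix} -3.381 & -1.603 & 0 \\ -3.176 & 1.707 & 0 \end{pmatrix}
= \begin{pmatrix} 4.639 q_{11} & 2.342 q_{12} & 0 \\ 4.639 q_{21} & 2.342 q_{22} & 0 \end{pmatrix}. \tag{8.392}$$

Solving for $q_{ij}$, we find that

$$\mathbf{Q}_{2\times 2} = \begin{pmatrix} -0.7288 & -0.6847 \\ -0.6847 & 0.7288 \end{pmatrix}. \tag{8.393}$$

It is easily seen that this version of $\mathbf{Q}_{2\times 2}$ differs from the first version by a sign
change in the first column. Direct substitution shows that the new decomposition also recovers
$\mathbf{A}_{2\times 3}$:

$$\mathbf{Q}_{2\times 2}\cdot\mathbf{B}_{2\times 3}\cdot\mathbf{Q}^H_{3\times 3} =
\underbrace{\begin{pmatrix} -0.7288 & -0.6847 \\ -0.6847 & 0.7288 \end{pmatrix}}_{\mathbf{Q}_{2\times 2}}
\cdot
\underbrace{\begin{pmatrix} 4.639 & 0 & 0 \\ 0 & 2.342 & 0 \end{pmatrix}}_{\mathbf{B}_{2\times 3}}
\cdot
\underbrace{\begin{pmatrix} -0.4524 & 0.4714 & -0.7571 \\ 0.3301 & 0.8771 & 0.3489 \\ -0.8285 & 0.09206 & 0.5523 \end{pmatrix}}_{\mathbf{Q}^H_{3\times 3}}, \tag{8.394}$$

$$= \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix} = \mathbf{A}_{2\times 3}. \tag{8.395}$$

Both of the orthogonal matrices $\mathbf{Q}$ used in this section have determinant of $-1$, so they do not
preserve orientation.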
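The consistency requirement behind the more robust technique can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the procedure of the notes: it computes the $2\times 2$ orthogonal factor first (the mirror image of the order used above), derives each right singular vector from $\mathbf{v}_i = \mathbf{A}^T\cdot\mathbf{u}_i/\mu_i$ so the signs are automatically consistent, and omits the null-space column of $\mathbf{Q}_{3\times 3}$, which does not affect the product. All variable names are illustrative.

```python
import math

A = [[1.0, -3.0, 2.0], [2.0, 0.0, 3.0]]

# A . A^T is the symmetric 2x2 matrix [[p, q], [q, r]].
p = sum(a * a for a in A[0])
q = sum(a * b for a, b in zip(A[0], A[1]))
r = sum(b * b for b in A[1])

# Closed-form eigenvalues of a symmetric 2x2 matrix; singular values are their square roots.
disc = math.sqrt((p - r) ** 2 + 4 * q * q)
eigenvalues = [(p + r + disc) / 2, (p + r - disc) / 2]
mus = [math.sqrt(e) for e in eigenvalues]

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# Eigenvector of [[p, q], [q, r]] for eigenvalue e is (q, e - p), valid because q != 0 here.
U = [unit([q, e - p]) for e in eigenvalues]

# Consistent right singular vectors: v_i = A^T u_i / mu_i.
Vs = [[sum(A[k][j] * u[k] for k in range(2)) / mu for j in range(3)]
      for u, mu in zip(U, mus)]

# Rebuild A as the sum of mu_i * u_i * v_i^T.
A_rebuilt = [[sum(mu * u[i] * v[j] for u, v, mu in zip(U, Vs, mus)) for j in range(3)]
             for i in range(2)]
print(mus, A_rebuilt)
```

Whatever sign the eigenvector routine happens to return for $\mathbf{u}_i$, the derived $\mathbf{v}_i$ carries the matching sign, so the product recovers $\mathbf{A}$.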

Example 8.41

The singular value decomposition of another matrix considered in earlier examples, p. 348, has the
form

$$\mathbf{A} = \mathbf{Q}_2\cdot\mathbf{B}\cdot\mathbf{Q}_1^T.$$

All matrices are $2\times 2$, since $\mathbf{A}$ is square of dimension $2\times 2$. Interestingly
$\mathbf{Q}_2 = \mathbf{Q}_1^T$. Both induce a counterclockwise rotation of
$\alpha = \arcsin\sqrt{2/(5-\sqrt{5})} = \pi/3.0884 = 58.28^\circ$. We also have
$\det\mathbf{Q}_1 = \det\mathbf{Q}_2 = \|\mathbf{Q}_2\|_2 = \|\mathbf{Q}_1\|_2 = 1$. Thus, both are pure
rotations. By inspection $\|\mathbf{B}\|_2 = \|\mathbf{A}\|_2 = \sqrt{(3+\sqrt{5})/2} = 1.61803$.

The action of this composition of matrix operations on a unit square is depicted in Fig. 8.8. The
first rotation is induced by $\mathbf{Q}_1^T$. This is followed by an eigen-stretching of $\mathbf{B}$. The
action is completed by a rotation induced by $\mathbf{Q}_2$.

[Figure 8.8 panels: original unit square; after first rotation; after eigen-stretching; after second
rotation.]

Figure 8.8: Unit square transforming via rotation, stretching, and rotation of the singular
value decomposition under a linear area- and orientation-preserving alibi mapping.



It is also easily shown that the singular values of a square Hermitian matrix are identical
to the absolute values of the eigenvalues of that matrix. The singular values of a square
non-Hermitian matrix are not, in general, the eigenvalues of that matrix.

8.8.9 Hessenberg form 

A square matrix $\mathbf{A}$ can be decomposed into Hessenberg⁷ form

$$\mathbf{A} = \mathbf{Q}\cdot\mathbf{H}\cdot\mathbf{Q}^T, \tag{8.398}$$

where $\mathbf{Q}$ is an orthogonal (or unitary) matrix and $\mathbf{H}$ has zeros below the first
sub-diagonal. When $\mathbf{A}$ is Hermitian, $\mathbf{H}$ is tridiagonal, which is very easy to invert
numerically. Also $\mathbf{H}$ has the same eigenvalues as $\mathbf{A}$. Here the $\mathbf{H}$ of the
Hessenberg form is not to be confused with the Hessian matrix, which often is denoted by the same symbol;
see Eq. (1.283).



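Example 8.42

The Hessenberg form of our example square matrix $\mathbf{A}$ from p. 364 is

$$\mathbf{A} = \begin{pmatrix} -5 & 4 & 9 \\ -22 & 14 & 18 \\ 16 & -8 & -6 \end{pmatrix} = \mathbf{Q}\cdot\mathbf{H}\cdot\mathbf{Q}^T = \tag{8.399}$$

$$\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & -0.8087 & 0.5882 \\ 0 & 0.5882 & 0.8087 \end{pmatrix}}_{\mathbf{Q}}
\cdot
\underbrace{\begin{pmatrix} -5 & 2.0586 & 9.6313 \\ 27.2029 & 2.3243 & -24.0451 \\ 0 & 1.9459 & 5.6757 \end{pmatrix}}_{\mathbf{H}}
\cdot
\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & -0.8087 & 0.5882 \\ 0 & 0.5882 & 0.8087 \end{pmatrix}}_{\mathbf{Q}^T}. \tag{8.400}$$

The matrix $\mathbf{Q}$ found here has determinant of $-1$; it could be easily recalculated to arrive at an
orientation-preserving value of $+1$.

7 Karl Hessenberg, 1904-1959, German mathematician and engineer.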



8.9 Projection matrix 

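Here we consider a topic discussed earlier in a broader context, the projection matrix defined
in Eq. (7.160). The vector $\mathbf{A}\cdot\mathbf{x}$ belongs to the column space of $\mathbf{A}$. Here
$\mathbf{A}$ is not necessarily square. Consider the equation $\mathbf{A}\cdot\mathbf{x} = \mathbf{b}$,
where $\mathbf{A}$ and $\mathbf{b}$ are given. If the given vector $\mathbf{b}$ does not lie in the column
space of $\mathbf{A}$, the equation cannot be solved for $\mathbf{x}$. Still, we would like to find
$\mathbf{x}_p$ such that

$$\mathbf{A}\cdot\mathbf{x}_p = \mathbf{b}_p, \tag{8.401}$$

which does lie in the column space of $\mathbf{A}$, such that $\mathbf{b}_p$ is the projection of
$\mathbf{b}$ onto the column space. The residual vector from Eq. (8.4) is also expressed as

$$\mathbf{r} = \mathbf{b}_p - \mathbf{b}. \tag{8.402}$$

For a projection, this residual should be orthogonal to all vectors $\mathbf{A}\cdot\mathbf{z}$ which belong
to the column space, where the components of $\mathbf{z}$ are arbitrary. Enforcing this condition, we get

$$0 = (\mathbf{A}\cdot\mathbf{z})^T\cdot\mathbf{r}, \tag{8.403}$$

$$= (\mathbf{A}\cdot\mathbf{z})^T\cdot\underbrace{(\mathbf{b}_p - \mathbf{b})}_{\mathbf{r}}, \tag{8.404}$$

$$= \mathbf{z}^T\cdot\mathbf{A}^T\cdot(\underbrace{\mathbf{A}\cdot\mathbf{x}_p}_{\mathbf{b}_p} - \mathbf{b}), \tag{8.405}$$

$$= \mathbf{z}^T\cdot(\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{x}_p - \mathbf{A}^T\cdot\mathbf{b}). \tag{8.406}$$

Since $\mathbf{z}$ is an arbitrary vector,

$$\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{x}_p - \mathbf{A}^T\cdot\mathbf{b} = \mathbf{0}, \tag{8.407}$$

from which

$$\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{x}_p = \mathbf{A}^T\cdot\mathbf{b}, \tag{8.408}$$

$$\mathbf{x}_p = (\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T\cdot\mathbf{b}, \tag{8.409}$$

$$\mathbf{A}\cdot\mathbf{x}_p = \mathbf{A}\cdot(\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T\cdot\mathbf{b}, \tag{8.410}$$

$$\mathbf{b}_p = \underbrace{\mathbf{A}\cdot(\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T}_{\mathbf{P}}\cdot\mathbf{b}. \tag{8.411}$$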

CC BY-NC-ND. 29 July 2012, Sen & Powers.




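This is equivalent to that given in Eq. (7.160). The projection matrix $\mathbf{P}$ defined by
$\mathbf{b}_p = \mathbf{P}\cdot\mathbf{b}$ is

$$\mathbf{P} = \mathbf{A}\cdot(\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T. \tag{8.412}$$

The projection matrix for an operator $\mathbf{A}$, when operating on an arbitrary vector $\mathbf{b}$,
yields the projection of $\mathbf{b}$ onto the column space of $\mathbf{A}$. Note that many vectors
$\mathbf{b}$ could have the same projection onto the column space of $\mathbf{A}$. It can be shown that an
$N\times N$ matrix $\mathbf{P}$ is a projection matrix iff $\mathbf{P}\cdot\mathbf{P} = \mathbf{P}$. Because
of this, the projection matrix is idempotent:
$\mathbf{P}\cdot\mathbf{x} = \mathbf{P}\cdot\mathbf{P}\cdot\mathbf{x} = \ldots = \mathbf{P}^n\cdot\mathbf{x}$.
Moreover, the rank of $\mathbf{P}$ is its trace.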



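Example 8.43

Determine and analyze the projection matrix associated with projecting a vector
$\mathbf{b} \in \mathbb{R}^3$ onto the two-dimensional space spanned by the basis vectors $(1, 2, 3)^T$ and
$(1, 1, 1)^T$.

We form the matrix $\mathbf{A}$ by populating its columns with the basis vectors. So

$$\mathbf{A} = \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{pmatrix}. \tag{8.413}$$

Then we find the projection matrix $\mathbf{P}$ via Eq. (8.412):

$$\mathbf{P} = \mathbf{A}\cdot(\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T
= \begin{pmatrix} \frac{5}{6} & \frac{1}{3} & -\frac{1}{6} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ -\frac{1}{6} & \frac{1}{3} & \frac{5}{6} \end{pmatrix}. \tag{8.414}$$

By inspection $\mathbf{P}$ is self-adjoint, thus it is guaranteed to possess real eigenvalues, which are
$\lambda = 1, 1, 0$. There is one non-zero eigenvalue for each of the two linearly independent basis
vectors which form $\mathbf{A}$. It is easily shown that $\|\mathbf{P}\|_2 = 1$, $\rho(\mathbf{P}) = 1$, and
$\det\mathbf{P} = 0$. Thus, $\mathbf{P}$ is singular. This is because it maps vectors in three space to two
space. The rank of $\mathbf{P}$ is 2 as is its trace. Note that, as required of all projection matrices,
$\mathbf{P}\cdot\mathbf{P} = \mathbf{P}$:

$$\begin{pmatrix} \frac{5}{6} & \frac{1}{3} & -\frac{1}{6} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ -\frac{1}{6} & \frac{1}{3} & \frac{5}{6} \end{pmatrix}
\cdot
\begin{pmatrix} \frac{5}{6} & \frac{1}{3} & -\frac{1}{6} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ -\frac{1}{6} & \frac{1}{3} & \frac{5}{6} \end{pmatrix}
= \begin{pmatrix} \frac{5}{6} & \frac{1}{3} & -\frac{1}{6} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ -\frac{1}{6} & \frac{1}{3} & \frac{5}{6} \end{pmatrix}. \tag{8.415}$$

That is to say, $\mathbf{P}$ is idempotent.

It is easily shown the singular value decomposition of $\mathbf{P}$ is equivalent to a diagonalization,
giving

$$\mathbf{P} = \mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^T =
\underbrace{\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{6}} \\ 0 & \frac{1}{\sqrt{3}} & -\sqrt{\frac{2}{3}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{6}} \end{pmatrix}}_{\mathbf{Q}}
\cdot
\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}}_{\boldsymbol{\Lambda}}
\cdot
\underbrace{\begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & -\sqrt{\frac{2}{3}} & \frac{1}{\sqrt{6}} \end{pmatrix}}_{\mathbf{Q}^T}. \tag{8.416}$$

The matrix $\mathbf{Q}$ has $\|\mathbf{Q}\|_2 = 1$ and $\det\mathbf{Q} = 1$, so it is a true rotation. Thus,
when $\mathbf{P}$ is applied to a vector $\mathbf{b}$ to obtain $\mathbf{b}_p$, we can consider $\mathbf{b}$
to be first rotated into the configuration aligned with the two basis vectors via application of
$\mathbf{Q}^T$. Then in this configuration, one of the modes of $\mathbf{b}$ is suppressed via application
of $\boldsymbol{\Lambda}$, while the other two modes are preserved. The result is returned to its original
configuration via application of $\mathbf{Q}$, which precisely provides a counter-rotation to
$\mathbf{Q}^T$. Note also that the decomposition is equivalent to that previously discussed on p. 372; here,
$\boldsymbol{\Lambda} = \mathbf{R}\cdot(\mathbf{R}^T\cdot\mathbf{R})\cdot\mathbf{R}^T$, where $\mathbf{R}$ is
as was defined on p. 372. Note specifically that $\mathbf{P}$ has rank $r = 2$ and that
$\boldsymbol{\Lambda}$ has $r = 2$ values of unity on its diagonal.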
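The projection matrix of this example can be reproduced exactly with rational arithmetic. The sketch below is illustrative only; the helper names (`matmul`, `transpose`, `inverse_2x2`, `projection_matrix`) and the sample vector `b` are assumptions introduced here, and the $2\times 2$ inverse is hard-coded because $\mathbf{A}^T\cdot\mathbf{A}$ is $2\times 2$ for a two-column $\mathbf{A}$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def projection_matrix(A):
    """P = A (A^T A)^{-1} A^T for a matrix A with two independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

A = [[1, 1], [2, 1], [3, 1]]          # columns are the basis vectors of the example
P = projection_matrix(A)

b = [[1], [0], [0]]                   # an arbitrary sample vector
b_p = matmul(P, b)                    # its projection onto the column space of A
residual = [[bp[0] - bi[0]] for bp, bi in zip(b_p, b)]
orthogonality = matmul(transpose(A), residual)   # should vanish
print(P, b_p)
```

The vanishing of `orthogonality` is the condition enforced in the derivation above: the residual is orthogonal to every column of $\mathbf{A}$.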




8.10 Method of least squares 

One important application of projection matrices is the method of least squares. This method
is often used to fit data to a given functional form. The form is most often in terms of polynomials,
but there is absolutely no restriction; trigonometric functions, logarithmic functions, and
Bessel functions can all serve as well. Now if one has, say, ten data points, one can, in principle,
find a ninth order polynomial which will pass through all the data points. Often,
especially when there is much experimental error in the data, such a function may be subject
to wild oscillations, which are unwarranted by the underlying physics, and thus is not useful
as a predictive tool. In such cases, it may be more useful to choose a lower order curve which
does not exactly pass through all experimental points, but which does minimize the residual.
In this method, one

• examines the data, 

• makes a non-unique judgment of what the functional form might be, 

• substitutes each data point into the assumed form so as to form an over-constrained 
system of linear equations, and 

• uses the technique associated with projection matrices to solve for the coefficients which 
best represent the given data. 

8.10.1 Unweighted least squares 

This is the most common method used when one has equal confidence in all the data. 



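Example 8.44

Find the best straight line to approximate the measured data relating $x$ to $t$.

    t     x
    0     5
    1     7
    2     10
    3     12
    6     15          (8.417)

A straight line fit will have the form

$$x = a_0 + a_1 t, \tag{8.418}$$

where $a_0$ and $a_1$ are the terms to be determined. Substituting each data point into the assumed form,
we get five equations in two unknowns:

$$5 = a_0 + 0a_1, \tag{8.419}$$

$$7 = a_0 + 1a_1, \tag{8.420}$$

$$10 = a_0 + 2a_1, \tag{8.421}$$

$$12 = a_0 + 3a_1, \tag{8.422}$$

$$15 = a_0 + 6a_1. \tag{8.423}$$

Rearranging, we get

$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 6 \end{pmatrix}
\cdot
\begin{pmatrix} a_0 \\ a_1 \end{pmatrix}
= \begin{pmatrix} 5 \\ 7 \\ 10 \\ 12 \\ 15 \end{pmatrix}. \tag{8.424}$$

This is of the form $\mathbf{A}\cdot\mathbf{a} = \mathbf{b}$. We then find that

$$\mathbf{a} = (\mathbf{A}^T\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^T\cdot\mathbf{b}. \tag{8.425}$$

Substituting, we find that

$$\begin{pmatrix} a_0 \\ a_1 \end{pmatrix} =
\left(
\underbrace{\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 6 \end{pmatrix}}_{\mathbf{A}^T}
\cdot
\underbrace{\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 6 \end{pmatrix}}_{\mathbf{A}}
\right)^{-1}
\cdot
\underbrace{\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 6 \end{pmatrix}}_{\mathbf{A}^T}
\cdot
\underbrace{\begin{pmatrix} 5 \\ 7 \\ 10 \\ 12 \\ 15 \end{pmatrix}}_{\mathbf{b}}
= \begin{pmatrix} 5.7925 \\ 1.6698 \end{pmatrix}. \tag{8.426}$$

So the best fit estimate is

$$x = 5.7925 + 1.6698\, t. \tag{8.427}$$

The Euclidean norm of the residual is $\|\mathbf{A}\cdot\mathbf{a} - \mathbf{b}\|_2 = 1.9206$. This
represents the $\ell_2$ residual of the prediction. A plot of the raw data and the best fit straight line is
shown in Fig. 8.9.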
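For a straight-line fit the normal equations $\mathbf{A}^T\cdot\mathbf{A}\cdot\mathbf{a} = \mathbf{A}^T\cdot\mathbf{b}$ are only $2\times 2$, so they can be solved in closed form. The following sketch, with the illustrative function name `fit_line`, reproduces the numbers of this example using the standard library alone.

```python
import math

def fit_line(t, x):
    """Solve the 2x2 normal equations A^T A a = A^T b for the model x = a0 + a1 t."""
    n = len(t)
    st = sum(t)
    stt = sum(ti * ti for ti in t)
    sx = sum(x)
    stx = sum(ti * xi for ti, xi in zip(t, x))
    det = n * stt - st * st              # determinant of A^T A
    a0 = (stt * sx - st * stx) / det
    a1 = (n * stx - st * sx) / det
    return a0, a1

t = [0, 1, 2, 3, 6]
x = [5, 7, 10, 12, 15]
a0, a1 = fit_line(t, x)

# Euclidean norm of the residual A.a - b
residual_norm = math.sqrt(sum((a0 + a1 * ti - xi) ** 2 for ti, xi in zip(t, x)))
print(a0, a1, residual_norm)
```

For larger models one would normally hand the over-constrained system to a linear algebra library rather than forming the normal equations by hand; the closed form is used here only to keep the sketch self-contained.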



8.10.2 Weighted least squares 

If one has more confidence in some data points than others, one can define a weighting 
function to give more priority to those particular data points. 



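Example 8.45

Find the best straight line fit for the data in the previous example. Now however, assume that we
have five times the confidence in the accuracy of the final two data points, relative to the other points.
Define a square weighting matrix $\mathbf{W}$:

$$\mathbf{W} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0 & 5 \end{pmatrix}. \tag{8.428}$$

Now we perform the following operations:

$$\mathbf{A}\cdot\mathbf{a} = \mathbf{b}, \tag{8.429}$$

$$\mathbf{W}\cdot\mathbf{A}\cdot\mathbf{a} = \mathbf{W}\cdot\mathbf{b}, \tag{8.430}$$

$$(\mathbf{W}\cdot\mathbf{A})^T\cdot\mathbf{W}\cdot\mathbf{A}\cdot\mathbf{a} = (\mathbf{W}\cdot\mathbf{A})^T\cdot\mathbf{W}\cdot\mathbf{b}, \tag{8.431}$$

$$\mathbf{a} = \left((\mathbf{W}\cdot\mathbf{A})^T\cdot\mathbf{W}\cdot\mathbf{A}\right)^{-1}\cdot(\mathbf{W}\cdot\mathbf{A})^T\cdot\mathbf{W}\cdot\mathbf{b}. \tag{8.432}$$

With values of $\mathbf{W}$ from Eq. (8.428), direct substitution leads to

$$\mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} 8.0008 \\ 1.1972 \end{pmatrix}. \tag{8.433}$$

So the best weighted least squares fit is

$$x = 8.0008 + 1.1972\, t. \tag{8.434}$$

A plot of the raw data and the best fit straight line is shown in Fig. 8.10.

[Figure 8.9 plot: data points and the line x = 5.7925 + 1.6698 t; horizontal axis t, vertical axis x.]

Figure 8.9: Plot of x - t data and best least squares straight line fit.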



When the measurements are independent and equally reliable, $\mathbf{W}$ is the identity matrix.
If the measurements are independent but not equally reliable, $\mathbf{W}$ is at most diagonal. If the
measurements are not independent, then non-zero terms can appear off the diagonal in $\mathbf{W}$.
It is often advantageous, for instance in problems in which one wants to control a process in
real time, to give priority to recent data estimates over old data estimates and to continually
employ a least squares technique to estimate future system behavior. The previous example
does just that. A famous fast algorithm for such problems is known as a Kalman⁸ filter.
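The weighted fit of Example 8.45 can be sketched in the same closed-form style. Because each equation is multiplied by its weight $w_i$ before the normal equations are formed, the squares $w_i^2$ appear in the sums. The function name `fit_line_weighted` is an illustrative assumption, and the sketch covers only a diagonal $\mathbf{W}$.

```python
def fit_line_weighted(t, x, w):
    """Straight-line fit x = a0 + a1 t with diagonal weights: solve (WA)^T WA a = (WA)^T W b."""
    w2 = [wi * wi for wi in w]           # W^T W is diagonal with entries w_i^2
    s = sum(w2)
    st = sum(wi * ti for wi, ti in zip(w2, t))
    stt = sum(wi * ti * ti for wi, ti in zip(w2, t))
    sx = sum(wi * xi for wi, xi in zip(w2, x))
    stx = sum(wi * ti * xi for wi, ti, xi in zip(w2, t, x))
    det = s * stt - st * st
    a0 = (stt * sx - st * stx) / det
    a1 = (s * stx - st * sx) / det
    return a0, a1

t = [0, 1, 2, 3, 6]
x = [5, 7, 10, 12, 15]
w = [1, 1, 1, 5, 5]                      # five times the confidence in the last two points
a0_w, a1_w = fit_line_weighted(t, x, w)
a0_u, a1_u = fit_line_weighted(t, x, [1] * 5)   # unit weights recover the unweighted fit
print(a0_w, a1_w)
```

With unit weights the routine reduces to the unweighted fit, which is a convenient sanity check on the weighting logic.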



8 Rudolf Emil Kalman, 1930-, Hungarian/American electrical engineer.

[Figure 8.10 plot: data points, with the weighted data points marked, and the line
x = 8.0008 + 1.1972 t; horizontal axis t, vertical axis x.]

Figure 8.10: Plot of x - t data and best weighted least squares straight line fit.

8.11 Matrix exponential 



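Definition: The exponential matrix is defined as

$$e^{\mathbf{A}} = \mathbf{I} + \mathbf{A} + \frac{1}{2!}\mathbf{A}^2 + \frac{1}{3!}\mathbf{A}^3 + \cdots. \tag{8.435}$$

Thus

$$e^{\mathbf{A}t} = \mathbf{I} + \mathbf{A}t + \frac{1}{2!}\mathbf{A}^2t^2 + \frac{1}{3!}\mathbf{A}^3t^3 + \cdots, \tag{8.436}$$

$$\frac{d}{dt}\left(e^{\mathbf{A}t}\right) = \mathbf{A} + \mathbf{A}^2t + \frac{1}{2!}\mathbf{A}^3t^2 + \cdots, \tag{8.437}$$

$$= \mathbf{A}\cdot\left(\mathbf{I} + \mathbf{A}t + \frac{1}{2!}\mathbf{A}^2t^2 + \frac{1}{3!}\mathbf{A}^3t^3 + \cdots\right), \tag{8.438}$$

$$= \mathbf{A}\cdot e^{\mathbf{A}t}. \tag{8.439}$$

Properties of the matrix exponential include

$$e^{a\mathbf{I}} = e^a\mathbf{I}, \tag{8.440}$$

$$\left(e^{\mathbf{A}}\right)^{-1} = e^{-\mathbf{A}}, \tag{8.441}$$

$$e^{\mathbf{A}(t+s)} = e^{\mathbf{A}t}\cdot e^{\mathbf{A}s}. \tag{8.442}$$

But $e^{\mathbf{A}+\mathbf{B}} = e^{\mathbf{A}}\cdot e^{\mathbf{B}}$ only if
$\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$. Thus,
$e^{t\mathbf{I}+s\mathbf{A}} = e^t e^{s\mathbf{A}}$.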
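Example 8.46

Find $e^{\mathbf{A}t}$ if

$$\mathbf{A} = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix}. \tag{8.443}$$

We have

$$\mathbf{A} = a\mathbf{I} + \mathbf{B}, \tag{8.444}$$

where

$$\mathbf{B} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. \tag{8.445}$$

Thus

$$\mathbf{B}^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{8.446}$$

$$\mathbf{B}^n = \mathbf{0} \quad \text{for } n \ge 3. \tag{8.447}$$

Furthermore

$$\mathbf{I}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{I} = \mathbf{B}. \tag{8.451}$$

Thus

$$e^{\mathbf{A}t} = e^{(a\mathbf{I}+\mathbf{B})t}, \tag{8.452}$$

$$= e^{at\mathbf{I}}\cdot e^{\mathbf{B}t}, \tag{8.453}$$

$$= \left(\mathbf{I} + at\mathbf{I} + \frac{1}{2!}a^2t^2\mathbf{I}^2 + \frac{1}{3!}a^3t^3\mathbf{I}^3 + \cdots\right)\cdot\left(\mathbf{I} + \mathbf{B}t + \frac{1}{2!}\mathbf{B}^2t^2 + \frac{1}{3!}\mathbf{B}^3t^3 + \cdots\right), \tag{8.454}$$

$$= e^{at}\mathbf{I}\cdot\left(\mathbf{I} + \mathbf{B}t + \mathbf{B}^2\frac{t^2}{2}\right), \tag{8.455}$$

$$= e^{at}\begin{pmatrix} 1 & t & \frac{t^2}{2} \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.456}$$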
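The closed form of this example can be compared against a direct partial sum of the defining series. The sketch below is illustrative: `expm_series` is a hypothetical helper written here, the truncation at 40 terms is an arbitrary choice adequate for small matrices of modest norm, and the values of `a` and `t` are sample inputs.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def expm_series(M, terms=40):
    """Partial sum I + M + M^2/2! + ... of the matrix exponential series."""
    n = len(M)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        # term now holds M^k / k!
        term = [[v / k for v in row] for row in matmul(term, M)]
        total = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

a, t = 0.5, 2.0
At = [[a * t, t, 0.0], [0.0, a * t, t], [0.0, 0.0, a * t]]
series = expm_series(At)

e = math.exp(a * t)
closed = [[e, e * t, e * t * t / 2], [0.0, e, e * t], [0.0, 0.0, e]]   # closed form of the example
max_diff = max(abs(p - q) for r1, r2 in zip(series, closed) for p, q in zip(r1, r2))
print(max_diff)
```

A truncated series is not a robust general-purpose method for the matrix exponential; it serves here only to confirm the algebra of the example.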




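If $\mathbf{A}$ can be diagonalized, the calculation is simplified. Then

$$e^{\mathbf{A}t} = e^{\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}t} = \mathbf{I} + \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}t + \ldots + \frac{1}{N!}\left(\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}t\right)^N + \cdots. \tag{8.457}$$

Noting that

$$\left(\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}\right)^2 = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}\cdot\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1} = \mathbf{S}\cdot\boldsymbol{\Lambda}^2\cdot\mathbf{S}^{-1}, \tag{8.458}$$

$$\left(\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}\right)^N = \mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1}\cdot\ldots\cdot\mathbf{S}\cdot\boldsymbol{\Lambda}\cdot\mathbf{S}^{-1} = \mathbf{S}\cdot\boldsymbol{\Lambda}^N\cdot\mathbf{S}^{-1}, \tag{8.459}$$

the original expression reduces to

$$e^{\mathbf{A}t} = \mathbf{S}\cdot\left(\mathbf{I} + \boldsymbol{\Lambda}t + \ldots + \frac{1}{N!}\left(\boldsymbol{\Lambda}^N t^N\right) + \cdots\right)\cdot\mathbf{S}^{-1}, \tag{8.460}$$

$$= \mathbf{S}\cdot e^{\boldsymbol{\Lambda}t}\cdot\mathbf{S}^{-1}. \tag{8.461}$$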
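The diagonalization route can be illustrated with a small symmetric matrix, for which $\mathbf{S}$ can be taken as an orthogonal $\mathbf{Q}$ and $\mathbf{S}^{-1} = \mathbf{Q}^T$. The sketch below uses the matrix with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are 1 and 3; the function names `expm_diagonalized` and `expm_series`, and the sample value of `t`, are assumptions made for this illustration. The result is compared against a partial sum of the series.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def expm_diagonalized(Q, eigenvalues, t):
    """e^{At} = Q . e^{Lambda t} . Q^T for symmetric A with orthonormal eigenvectors in Q."""
    n = len(eigenvalues)
    exp_lam = [[math.exp(eigenvalues[i] * t) if i == j else 0.0 for j in range(n)]
               for i in range(n)]
    Qt = [list(col) for col in zip(*Q)]
    return matmul(matmul(Q, exp_lam), Qt)

def expm_series(M, terms=40):
    """Partial sum of the defining series, used only as an independent check."""
    n = len(M)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, M)]
        total = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

s = 1 / math.sqrt(2)
A = [[2.0, 1.0], [1.0, 2.0]]
Q = [[s, s], [-s, s]]                 # columns: eigenvectors for lambda = 1 and lambda = 3
t = 0.7
via_eig = expm_diagonalized(Q, [1.0, 3.0], t)
via_series = expm_series([[v * t for v in row] for row in A])
max_diff = max(abs(p - q) for r1, r2 in zip(via_eig, via_series) for p, q in zip(r1, r2))
print(via_eig, max_diff)
```

Only scalar exponentials of the eigenvalues are needed once the eigenvectors are known, which is the simplification expressed by Eq. (8.461).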

8.12 Quadratic form 

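At times one may be given a polynomial equation for which one wants to determine conditions
under which the expression is positive. For example if we have

$$f(\xi_1, \xi_2, \xi_3) = 18\xi_1^2 - 16\xi_1\xi_2 + 5\xi_2^2 + 12\xi_1\xi_3 - 4\xi_2\xi_3 + 6\xi_3^2, \tag{8.462}$$

it is not obvious whether or not there exist $(\xi_1, \xi_2, \xi_3)$ which will give positive or negative
values of $f$. However, it is easily verified that $f$ can be rewritten as

$$f(\xi_1, \xi_2, \xi_3) = 2(\xi_1 - \xi_2 + \xi_3)^2 + 3(2\xi_1 - \xi_2)^2 + 4(\xi_1 + \xi_3)^2. \tag{8.463}$$

So in this case $f \ge 0$ for all $(\xi_1, \xi_2, \xi_3)$. How to demonstrate positivity (or non-positivity)
of such expressions is the topic of this section. A quadratic form is an expression

$$f(\xi_1, \ldots, \xi_N) = \sum_{j=1}^{N}\sum_{i=1}^{N} a_{ij}\xi_i\xi_j, \tag{8.464}$$

where $\{a_{ij}\}$ is a real, symmetric matrix which we will also call $\mathbf{A}$. The surface represented
by the equation $\sum_{j=1}^{N}\sum_{i=1}^{N} a_{ij}\xi_i\xi_j = \text{constant}$ is a quadric surface. With
the coefficient matrix defined, we can represent $f$ as

$$f = \boldsymbol{\xi}^T\cdot\mathbf{A}\cdot\boldsymbol{\xi}. \tag{8.465}$$

Now, by Eq. (8.321), $\mathbf{A}$ can be decomposed as
$\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^{-1}$, where $\mathbf{Q}$ is the orthogonal matrix
populated by the orthonormalized eigenvectors of $\mathbf{A}$, and $\boldsymbol{\Lambda}$ is the
corresponding diagonal matrix of eigenvalues. Thus, Eq. (8.465) becomes

$$f = \boldsymbol{\xi}^T\cdot\underbrace{\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^{-1}}_{\mathbf{A}}\cdot\boldsymbol{\xi}. \tag{8.466}$$

Since $\mathbf{Q}$ is orthogonal, $\mathbf{Q}^T = \mathbf{Q}^{-1}$, and we find

$$f = \boldsymbol{\xi}^T\cdot\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{Q}^T\cdot\boldsymbol{\xi}. \tag{8.467}$$

Now, define $\mathbf{x}$ so that
$\mathbf{x} = \mathbf{Q}^T\cdot\boldsymbol{\xi} = \mathbf{Q}^{-1}\cdot\boldsymbol{\xi}$. Consequently,
$\boldsymbol{\xi} = \mathbf{Q}\cdot\mathbf{x}$. Thus, Eq. (8.467) becomes

$$f = (\mathbf{Q}\cdot\mathbf{x})^T\cdot\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{x}, \tag{8.468}$$

$$= \mathbf{x}^T\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{x}, \tag{8.469}$$

$$= \mathbf{x}^T\cdot\mathbf{Q}^{-1}\cdot\mathbf{Q}\cdot\boldsymbol{\Lambda}\cdot\mathbf{x}, \tag{8.470}$$

$$= \mathbf{x}^T\cdot\boldsymbol{\Lambda}\cdot\mathbf{x}. \tag{8.471}$$

This standard form of a quadratic form is one in which the cross-product terms (i.e. $\xi_i\xi_j$,
$i \ne j$) do not appear.

Theorem

(Principal axis theorem) If $\mathbf{Q}$ is the orthogonal matrix and $\lambda_1, \ldots, \lambda_N$ the
eigenvalues corresponding to $\{a_{ij}\}$, a change in coordinates

$$\begin{pmatrix} \xi_1 \\ \vdots \\ \xi_N \end{pmatrix} = \mathbf{Q}\cdot\begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}, \tag{8.472}$$

will reduce the quadratic form, Eq. (8.464), to its standard quadratic form

$$f(x_1, \ldots, x_N) = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \cdots + \lambda_N x_N^2. \tag{8.473}$$

It is perhaps better to consider this as an alias rather than an alibi transformation.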



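Example 8.47

Change

$$f(\xi_1, \xi_2) = 2\xi_1^2 + 2\xi_1\xi_2 + 2\xi_2^2, \tag{8.474}$$

to a standard quadratic form.

For $N = 2$, Eq. (8.464) becomes

$$f(\xi_1, \xi_2) = a_{11}\xi_1^2 + (a_{12} + a_{21})\xi_1\xi_2 + a_{22}\xi_2^2. \tag{8.475}$$

We choose $\{a_{ij}\}$ such that the matrix is symmetric. This gives us

$$a_{11} = 2, \tag{8.476}$$

$$a_{12} = 1, \tag{8.477}$$

$$a_{21} = 1, \tag{8.478}$$

$$a_{22} = 2. \tag{8.479}$$

So we get

$$\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \tag{8.480}$$

The eigenvalues of $\mathbf{A}$ are $\lambda_1 = 1$, $\lambda_2 = 3$. The orthogonal matrix corresponding to
$\mathbf{A}$ is

$$\mathbf{Q} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}, \qquad
\mathbf{Q}^{-1} = \mathbf{Q}^T = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \tag{8.481}$$

The transformation $\boldsymbol{\xi} = \mathbf{Q}\cdot\mathbf{x}$ is

$$\xi_1 = \frac{1}{\sqrt{2}}(x_1 + x_2), \tag{8.482}$$

$$\xi_2 = \frac{1}{\sqrt{2}}(-x_1 + x_2). \tag{8.483}$$

We have $\det\mathbf{Q} = 1$, so the transformation is orientation-preserving. The inverse transformation
$\mathbf{x} = \mathbf{Q}^{-1}\cdot\boldsymbol{\xi} = \mathbf{Q}^T\cdot\boldsymbol{\xi}$ is

$$x_1 = \frac{1}{\sqrt{2}}(\xi_1 - \xi_2), \tag{8.484}$$

$$x_2 = \frac{1}{\sqrt{2}}(\xi_1 + \xi_2). \tag{8.485}$$

Using Eqs. (8.482, 8.483) to eliminate $\xi_1$ and $\xi_2$ in Eq. (8.474), we get a result in the form of
Eq. (8.473):

$$f(x_1, x_2) = x_1^2 + 3x_2^2. \tag{8.486}$$

In terms of the original variables, we get

$$f(\xi_1, \xi_2) = \frac{1}{2}(\xi_1 - \xi_2)^2 + \frac{3}{2}(\xi_1 + \xi_2)^2. \tag{8.487}$$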
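The change of variables in this example can be checked numerically at any sample point. In the sketch below, the helper names `quadratic_form` and `to_principal` and the sample point `xi` are illustrative assumptions; the matrix, eigenvalues, and $\mathbf{Q}$ are those of the example.

```python
import math

def quadratic_form(A, xi):
    """Evaluate xi^T . A . xi."""
    n = len(xi)
    return sum(A[i][j] * xi[i] * xi[j] for i in range(n) for j in range(n))

A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues = [1.0, 3.0]
s = 1 / math.sqrt(2)
Q = [[s, s], [-s, s]]                 # columns: orthonormal eigenvectors of A

def to_principal(xi):
    """x = Q^T . xi, the coordinates along the principal axes."""
    return [sum(Q[i][k] * xi[i] for i in range(2)) for k in range(2)]

xi = [0.3, -1.2]                      # arbitrary sample point
x = to_principal(xi)
f_original = quadratic_form(A, xi)
f_standard = sum(lam * v * v for lam, v in zip(eigenvalues, x))
print(f_original, f_standard)
```

Both evaluations agree because the cross-product term is absorbed entirely by the rotation to principal axes.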



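Example 8.48

Change

$$f(\xi_1, \xi_2, \xi_3) = 18\xi_1^2 - 16\xi_1\xi_2 + 5\xi_2^2 + 12\xi_1\xi_3 - 4\xi_2\xi_3 + 6\xi_3^2, \tag{8.488}$$

to a standard quadratic form.

For $N = 3$, Eq. (8.464) becomes

$$f(\xi_1, \xi_2, \xi_3) = \begin{pmatrix} \xi_1 & \xi_2 & \xi_3 \end{pmatrix}
\cdot
\begin{pmatrix} 18 & -8 & 6 \\ -8 & 5 & -2 \\ 6 & -2 & 6 \end{pmatrix}
\cdot
\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix}
= \boldsymbol{\xi}^T\cdot\mathbf{A}\cdot\boldsymbol{\xi}. \tag{8.489}$$

The eigenvalues of $\mathbf{A}$ are $\lambda_1 = 1$, $\lambda_2 = 4$, $\lambda_3 = 24$. The orthogonal
matrix corresponding to $\mathbf{A}$ is

$$\mathbf{Q} = \begin{pmatrix} -\frac{4}{\sqrt{69}} & -\frac{1}{\sqrt{30}} & \frac{13}{\sqrt{230}} \\ -\frac{7}{\sqrt{69}} & \sqrt{\frac{2}{15}} & -3\sqrt{\frac{2}{115}} \\ \frac{2}{\sqrt{69}} & \sqrt{\frac{5}{6}} & \sqrt{\frac{5}{46}} \end{pmatrix}, \qquad
\mathbf{Q}^{-1} = \mathbf{Q}^T = \begin{pmatrix} -\frac{4}{\sqrt{69}} & -\frac{7}{\sqrt{69}} & \frac{2}{\sqrt{69}} \\ -\frac{1}{\sqrt{30}} & \sqrt{\frac{2}{15}} & \sqrt{\frac{5}{6}} \\ \frac{13}{\sqrt{230}} & -3\sqrt{\frac{2}{115}} & \sqrt{\frac{5}{46}} \end{pmatrix}. \tag{8.490}$$

For this non-unique choice of $\mathbf{Q}$, we note that $\det\mathbf{Q} = -1$, so it fails to satisfy the
requirements of a right-handed coordinate system. For the purposes of this particular problem, this fact
has no consequence. The inverse transformation
$\mathbf{x} = \mathbf{Q}^{-1}\cdot\boldsymbol{\xi} = \mathbf{Q}^T\cdot\boldsymbol{\xi}$ is

$$x_1 = -\frac{4}{\sqrt{69}}\xi_1 - \frac{7}{\sqrt{69}}\xi_2 + \frac{2}{\sqrt{69}}\xi_3, \tag{8.491}$$

$$x_2 = -\frac{1}{\sqrt{30}}\xi_1 + \sqrt{\frac{2}{15}}\xi_2 + \sqrt{\frac{5}{6}}\xi_3, \tag{8.492}$$

$$x_3 = \frac{13}{\sqrt{230}}\xi_1 - 3\sqrt{\frac{2}{115}}\xi_2 + \sqrt{\frac{5}{46}}\xi_3. \tag{8.493}$$

Directly imposing then the standard quadratic form of Eq. (8.473) onto Eq. (8.488), we get

$$f(x_1, x_2, x_3) = x_1^2 + 4x_2^2 + 24x_3^2. \tag{8.494}$$

In terms of the original variables, we get

$$f(\xi_1, \xi_2, \xi_3) = \left(-\frac{4}{\sqrt{69}}\xi_1 - \frac{7}{\sqrt{69}}\xi_2 + \frac{2}{\sqrt{69}}\xi_3\right)^2
+ 4\left(-\frac{1}{\sqrt{30}}\xi_1 + \sqrt{\frac{2}{15}}\xi_2 + \sqrt{\frac{5}{6}}\xi_3\right)^2
+ 24\left(\frac{13}{\sqrt{230}}\xi_1 - 3\sqrt{\frac{2}{115}}\xi_2 + \sqrt{\frac{5}{46}}\xi_3\right)^2. \tag{8.495}$$

It is clear that $f(\xi_1, \xi_2, \xi_3)$ is positive definite. Moreover, by performing the multiplications,
it is easily seen that the original form is recovered. Further manipulation would also show that

$$f(\xi_1, \xi_2, \xi_3) = 2(\xi_1 - \xi_2 + \xi_3)^2 + 3(2\xi_1 - \xi_2)^2 + 4(\xi_1 + \xi_3)^2, \tag{8.496}$$

so we see the particular quadratic form is not unique.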
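The equivalence of the three expressions for $f$ in this example can be confirmed exactly. The sketch below uses unnormalized integer eigenvectors, dividing by their squared lengths with rational arithmetic so that no square roots are needed; the function names and the test points are illustrative assumptions.

```python
from fractions import Fraction

A = [[18, -8, 6], [-8, 5, -2], [6, -2, 6]]

# Eigenvalue and unnormalized eigenvector pairs; squared lengths are 69, 30 and 230.
pairs = [(1, (-4, -7, 2)), (4, (-1, 2, 5)), (24, (13, -6, 5))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f_original(xi):
    """xi^T . A . xi"""
    return sum(A[i][j] * xi[i] * xi[j] for i in range(3) for j in range(3))

def f_standard(xi):
    """Sum of lambda_k x_k^2 with x_k = (v_k . xi) / |v_k|, kept exact via fractions."""
    return sum(lam * Fraction(dot(v, xi) ** 2, dot(v, v)) for lam, v in pairs)

def f_squares(xi):
    """The alternative sum-of-squares form quoted in the text."""
    x1, x2, x3 = xi
    return 2 * (x1 - x2 + x3) ** 2 + 3 * (2 * x1 - x2) ** 2 + 4 * (x1 + x3) ** 2

# Each pair should satisfy A . v = lambda v.
eigen_ok = all([dot(row, v) for row in A] == [lam * c for c in v] for lam, v in pairs)

points = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, -2, 3), (-5, 4, 7)]
values = [(f_original(p), f_standard(p), f_squares(p)) for p in points]
print(eigen_ok, values)
```

All three forms return the same value at every point tried, consistent with the remark that the representation as a sum of squares is not unique.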



8.13 Moore-Penrose inverse 

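We seek the Moore-Penrose⁹ inverse: $\mathbf{A}^+_{M\times N}$ such that the following four conditions are
satisfied

$$\mathbf{A}_{N\times M}\cdot\mathbf{A}^+_{M\times N}\cdot\mathbf{A}_{N\times M} = \mathbf{A}_{N\times M}, \tag{8.497}$$

$$\mathbf{A}^+_{M\times N}\cdot\mathbf{A}_{N\times M}\cdot\mathbf{A}^+_{M\times N} = \mathbf{A}^+_{M\times N}, \tag{8.498}$$

$$\left(\mathbf{A}_{N\times M}\cdot\mathbf{A}^+_{M\times N}\right)^H = \mathbf{A}_{N\times M}\cdot\mathbf{A}^+_{M\times N}, \tag{8.499}$$

$$\left(\mathbf{A}^+_{M\times N}\cdot\mathbf{A}_{N\times M}\right)^H = \mathbf{A}^+_{M\times N}\cdot\mathbf{A}_{N\times M}. \tag{8.500}$$

9 After Eliakim Hastings Moore, 1862-1932, American mathematician, and Sir Roger Penrose, 1931-, English
mathematician. It is also credited to Arne Bjerhammar, 1917-2011, Swedish geodesist.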
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This will be achieved if we define 



L MxiV 



Qmxm • B Mx7V • Q NxN . (8.501) 

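The matrix $\mathbf{B}^+$ is $M \times N$ with $\mu_i^{-1}$, $(i = 1, 2, \ldots)$ in the first $r$ positions on the main diagonal. This is closely related to the $N \times M$ matrix $\mathbf{B}$, defined in Sec. 8.8.8, having $\mu_i$ on the same diagonal positions. The Moore-Penrose inverse, $\mathbf{A}^+_{M\times N}$, is also known as the pseudo-inverse. This is because in the special case in which $N \le M$ and $N = r$ it can be shown that

$$\mathbf{A}_{N\times M} \cdot \mathbf{A}^+_{M\times N} = \mathbf{I}_{N\times N}.$$

Let's check this with our definitions for the case when $N \le M$, $N = r$:

$$\begin{aligned}
\mathbf{A}_{N\times M} \cdot \mathbf{A}^+_{M\times N} &= \left(\mathbf{Q}_{N\times N} \cdot \mathbf{B}_{N\times M} \cdot \mathbf{Q}^H_{M\times M}\right) \cdot \left(\mathbf{Q}_{M\times M} \cdot \mathbf{B}^+_{M\times N} \cdot \mathbf{Q}^H_{N\times N}\right), & (8.502)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{B}_{N\times M} \cdot \mathbf{Q}^H_{M\times M} \cdot \mathbf{Q}_{M\times M} \cdot \mathbf{B}^+_{M\times N} \cdot \mathbf{Q}^H_{N\times N}, & (8.503)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{B}_{N\times M} \cdot \mathbf{Q}^{-1}_{M\times M} \cdot \mathbf{Q}_{M\times M} \cdot \mathbf{B}^+_{M\times N} \cdot \mathbf{Q}^H_{N\times N}, & (8.504)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{B}_{N\times M} \cdot \mathbf{B}^+_{M\times N} \cdot \mathbf{Q}^H_{N\times N}, & (8.505)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{I}_{N\times N} \cdot \mathbf{Q}^H_{N\times N}, & (8.506)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{Q}^H_{N\times N}, & (8.507)\\
&= \mathbf{Q}_{N\times N} \cdot \mathbf{Q}^{-1}_{N\times N}, & (8.508)\\
&= \mathbf{I}_{N\times N}. & (8.509)
\end{aligned}$$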



We note for this special case that, precisely because of the way we defined $\mathbf{B}^+$, $\mathbf{B}_{N\times M} \cdot \mathbf{B}^+_{M\times N} = \mathbf{I}_{N\times N}$. When $N > M$, $\mathbf{B}_{N\times M} \cdot \mathbf{B}^+_{M\times N}$ yields a matrix with $r$ ones on the diagonal and zeros elsewhere.



I 

Example 8.49 



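Find the Moore-Penrose inverse, $\mathbf{A}^+_{3\times 2}$, of $\mathbf{A}_{2\times 3}$ from the matrix of a previous example, p. 366:

$$\mathbf{A}_{2\times 3} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix}. \qquad (8.510)$$

$$\begin{aligned}
\mathbf{A}^+_{3\times 2} &= \mathbf{Q}_{3\times 3} \cdot \mathbf{B}^+_{3\times 2} \cdot \mathbf{Q}^H_{2\times 2}, & (8.511)\\
&= \begin{pmatrix} 0.452350 & 0.330059 & -0.828517 \\ -0.471378 & 0.877114 & 0.0920575 \\ 0.757088 & 0.348902 & 0.552345 \end{pmatrix} \begin{pmatrix} \frac{1}{4.6385} & 0 \\ 0 & \frac{1}{2.3419} \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0.728827 & 0.684698 \\ -0.684698 & 0.728827 \end{pmatrix}, & (8.512)\\
&= \begin{pmatrix} -0.0254237 & 0.169492 \\ -0.330508 & 0.20339 \\ 0.0169492 & 0.220339 \end{pmatrix}. & (8.513)
\end{aligned}$$

Note that

$$\mathbf{A}_{2\times 3} \cdot \mathbf{A}^+_{3\times 2} = \begin{pmatrix} 1 & -3 & 2 \\ 2 & 0 & 3 \end{pmatrix} \begin{pmatrix} -0.0254237 & 0.169492 \\ -0.330508 & 0.20339 \\ 0.0169492 & 0.220339 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \qquad (8.514)$$

Both $\mathbf{Q}$ matrices have a determinant of $+1$ and are thus volume- and orientation-preserving.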
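As a numerical cross-check of the construction $\mathbf{A}^+ = \mathbf{Q}_{M\times M} \cdot \mathbf{B}^+ \cdot \mathbf{Q}^H_{N\times N}$, here is a minimal sketch assuming NumPy. The variable names and the rank tolerance `1e-12` are illustrative assumptions; the singular vectors NumPy returns may differ in sign from those printed in the example, but the resulting pseudo-inverse is the same.

```python
import numpy as np

A = np.array([[1.0, -3.0, 2.0],
              [2.0, 0.0, 3.0]])   # the 2x3 matrix of the example

# Full SVD: A = U . diag(s) . Vh, with U 2x2 and Vh 3x3.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# B+ is 3x2 with reciprocals of the non-zero singular values on its diagonal.
B_plus = np.zeros((A.shape[1], A.shape[0]))
rank = int(np.sum(s > 1e-12))
B_plus[:rank, :rank] = np.diag(1.0 / s[:rank])

# Real case: the Hermitian transpose reduces to the ordinary transpose.
A_plus = Vh.T @ B_plus @ U.T
print(s)
print(A_plus)
print(A @ A_plus)
```

Because this A has full row rank (N = r = 2), the product `A @ A_plus` is the 2x2 identity, while `A_plus @ A` is only a projection.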
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I 

Example 8.50 

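Use the Moore-Penrose inverse to solve the problem $\mathbf{A} \cdot \mathbf{x} = \mathbf{b}$ studied in an earlier example, p. 341:

$$\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}. \qquad (8.515)$$

We first seek the singular value decomposition of $\mathbf{A}$, $\mathbf{A} = \mathbf{Q}_2 \cdot \mathbf{B} \cdot \mathbf{Q}_1^H$. Now

$$\mathbf{A}^H \cdot \mathbf{A} = \begin{pmatrix} 1 & 3 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} = \begin{pmatrix} 10 & 20 \\ 20 & 40 \end{pmatrix}. \qquad (8.516)$$

The eigensystem with normalized eigenvectors corresponding to $\mathbf{A}^H \cdot \mathbf{A}$ is

$$\lambda_1 = 50, \qquad \mathbf{e}_1 = \begin{pmatrix} \frac{1}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} \end{pmatrix}, \qquad (8.517)$$

$$\lambda_2 = 0, \qquad \mathbf{e}_2 = \begin{pmatrix} -\frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{pmatrix}, \qquad (8.518)$$

so

$$\mathbf{Q}_1 = \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix}, \qquad (8.519)$$

$$\mathbf{B} = \begin{pmatrix} \sqrt{50} & 0 \\ 0 & 0 \end{pmatrix},$$

so taking $\mathbf{A} \cdot \mathbf{Q}_1 = \mathbf{Q}_2 \cdot \mathbf{B}$ gives

$$\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix} = \begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{pmatrix} \begin{pmatrix} \sqrt{50} & 0 \\ 0 & 0 \end{pmatrix}, \qquad (8.521)$$

$$\begin{pmatrix} \sqrt{5} & 0 \\ 3\sqrt{5} & 0 \end{pmatrix} = \begin{pmatrix} \sqrt{50}\, q_{11} & 0 \\ \sqrt{50}\, q_{21} & 0 \end{pmatrix}. \qquad (8.522)$$

Solving, we get

$$\begin{pmatrix} q_{11} \\ q_{21} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} \end{pmatrix}. \qquad (8.523)$$

Imposing orthonormality to find $q_{12}$ and $q_{22}$, we get

$$\begin{pmatrix} q_{12} \\ q_{22} \end{pmatrix} = \begin{pmatrix} \frac{3}{\sqrt{10}} \\ -\frac{1}{\sqrt{10}} \end{pmatrix}, \qquad (8.524)$$

so

$$\mathbf{Q}_2 = \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} & -\frac{1}{\sqrt{10}} \end{pmatrix}, \qquad (8.525)$$

and

$$\mathbf{A} = \mathbf{Q}_2 \cdot \mathbf{B} \cdot \mathbf{Q}_1^H = \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} & -\frac{1}{\sqrt{10}} \end{pmatrix} \begin{pmatrix} \sqrt{50} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}. \qquad (8.526)$$

As an aside, note that $\mathbf{Q}_1$ is orientation-preserving, while $\mathbf{Q}_2$ is not, though this property is not important for this analysis.

We will need $\mathbf{B}^+$, which is easily calculated by taking the inverse of each non-zero diagonal term of $\mathbf{B}$:

$$\mathbf{B}^+ = \begin{pmatrix} \frac{1}{\sqrt{50}} & 0 \\ 0 & 0 \end{pmatrix}. \qquad (8.527)$$

Now the Moore-Penrose inverse is

$$\mathbf{A}^+ = \mathbf{Q}_1 \cdot \mathbf{B}^+ \cdot \mathbf{Q}_2^H = \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{50}} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} & -\frac{1}{\sqrt{10}} \end{pmatrix} = \begin{pmatrix} \frac{1}{50} & \frac{3}{50} \\ \frac{1}{25} & \frac{3}{25} \end{pmatrix}. \qquad (8.528)$$

Direct multiplication shows that $\mathbf{A} \cdot \mathbf{A}^+ \ne \mathbf{I}$. This is a consequence of $\mathbf{A}$ not being a full rank matrix. However, the four Moore-Penrose conditions are satisfied: $\mathbf{A} \cdot \mathbf{A}^+ \cdot \mathbf{A} = \mathbf{A}$, $\mathbf{A}^+ \cdot \mathbf{A} \cdot \mathbf{A}^+ = \mathbf{A}^+$, $(\mathbf{A} \cdot \mathbf{A}^+)^H = \mathbf{A} \cdot \mathbf{A}^+$, and $(\mathbf{A}^+ \cdot \mathbf{A})^H = \mathbf{A}^+ \cdot \mathbf{A}$.

Lastly, applying the Moore-Penrose inverse operator to the vector $\mathbf{b}$ to form $\mathbf{x} = \mathbf{A}^+ \cdot \mathbf{b}$, we get

$$\mathbf{x} = \mathbf{A}^+ \cdot \mathbf{b} = \begin{pmatrix} \frac{1}{50} & \frac{3}{50} \\ \frac{1}{25} & \frac{3}{25} \end{pmatrix} \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{25} \\ \frac{2}{25} \end{pmatrix}.$$

We see that the Moore-Penrose operator acting on $\mathbf{b}$ has yielded an $\mathbf{x}$ vector which is in the row space of $\mathbf{A}$. As there is no right null space component, it is the minimum length vector that minimizes the residual $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$. It is fully consistent with the solution we found using Gaussian elimination in an earlier example, p. 341.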
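The rank-deficient case can also be sketched numerically. This is a minimal illustration assuming NumPy; the tolerance `1e-10` used to decide which singular values count as zero, and all variable names, are assumptions made for the sketch.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 6.0]])      # rank-deficient matrix of the example
b = np.array([2.0, 0.0])

# Build the pseudo-inverse from the SVD, inverting only non-zero singular values.
U, s, Vh = np.linalg.svd(A)
s_inv = np.array([1.0 / sv if sv > 1e-10 else 0.0 for sv in s])
A_plus = Vh.T @ np.diag(s_inv) @ U.T

x = A_plus @ b                   # minimum-length least squares solution

# The right null space of A is spanned by (-2, 1)/sqrt(5).
null_vector = np.array([-2.0, 1.0]) / np.sqrt(5.0)
print(A_plus)
print(x, x @ null_vector)
```

The vanishing inner product of `x` with the null vector reflects the statement that the solution lies entirely in the row space of A.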

I 



Problems 

1. Find the $\mathbf{x}$ with smallest $\|\mathbf{x}\|_2$ which minimizes $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$ for



1 





3 \ 




/I 


2 


-1 


3 ' 


b = 


° 


3 


-1 


5/ 




\1 



2. Find the most general $\mathbf{x}$ which minimizes $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$ for

3. Find $\mathbf{x}$ with the smallest $\|\mathbf{x}\|_2$ which minimizes $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$ for

/l 1 4 \ / 2 

A= 1 2 -1 , b= 1 
\2 1 3 -2/ \ -3 



4. Find $e^{\mathbf{A}}$ if



1 1 1 
A= 3 2 

\ 5 

5. Diagonalize or reduce to Jordan canonical form 

/5 2 -1\ 

A= 5 1 . 
\ 5 / 

6. Find the eigenvectors and generalized eigenvectors of 

(1 1 1 1\ 

111 

1 

\ / 

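7. Decompose $\mathbf{A}$ into Jordan form $\mathbf{S} \cdot \mathbf{J} \cdot \mathbf{S}^{-1}$, $\mathbf{P}^{-1} \cdot \mathbf{L} \cdot \mathbf{D} \cdot \mathbf{U}$, $\mathbf{Q} \cdot \mathbf{R}$, Schur form, and Hessenberg form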

^0101^ 

10 10 

10 1 
\ 1 1 ) 

8. Find the matrix $\mathbf{S}$ that will convert the following to the Jordan canonical form

( 6 -1-3 1 \ 

16 1-3 

3 16-1 
\ 1 -3-1 6 / 

/ 8 -2-2 \ 

6 2-4 

-2 8-2 

\ 2 -4 6 / 



(b) 



and show the Jordan canonical form. 



9. Show that the eigenvectors and generalized eigenvectors of 



/ 1 1 2 \ 

13 

2 2 

\ 1 / 



span the space. 



10. Find the projection matrix onto the space spanned by $(1, 2, 3)^T$ and $(2, 3, 5)^T$. Find the projection of the vector $(7, 8, 9)^T$ onto this space.

11. Reduce $4x^2 + 4y^2 + 2z^2 - 4xy + 4yz + 4zx$ to standard quadratic form.
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12. Find the inverse of 



1/4 1/2 3/4 
3/4 1/2 1/4 
1/4 1/2 1/2 



/ i 

13. Find exp 1 

\ 1 

14. Find the nth power of 

15. If 



1 3 
3 1 



5 4 
1 2 

find a matrix $\mathbf{S}$ such that $\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S}$ is a diagonal matrix. Show by multiplication that it is indeed diagonal.

16. Determine if A = I and B = I I are similar. 

17. Find the eigenvalues, eigenvectors, and the matrix $\mathbf{S}$ such that $\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S}$ is diagonal or of Jordan form, where $\mathbf{A}$ is




18. Put each of the matrices above in L ■ D ■ U form. 

19. Put each of the matrices above in Q ■ R form. 

20. Put each of the matrices above in Schur form. 

21. Let 




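Find $\mathbf{S}$ such that $\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} = \mathbf{J}$, where $\mathbf{J}$ is of the Jordan form. Show by multiplication that $\mathbf{A} \cdot \mathbf{S} = \mathbf{S} \cdot \mathbf{J}$.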

22. Show that 

A / cos(l) sin(l) 



sin(l) cos(l) 

if 

1 

-1 
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23. Write A in row echelon form 



24. Show that the function 



10 
2-200 
10 12 



$$f(x, y, z) = x^2 + y^2 + z^2 + yz - zx - xy,$$
is always non-negative.

25. If A : £1 -» £l, find ||A|| when 

-(I ? 

Also find its inverse and adjoint. 

26. Is the quadratic form 

f(xi,x 2 ,x 3 ) = \x\ + 2x\Xi + 4a;ix 3 , 

positive definite? 

27. Find the Schur decomposition and Cholesky decompositions of A: 

/ \ 

1-30 

0-310' 
\ 0/ 

28. Find the $\mathbf{x}$ with minimum $\|\mathbf{x}\|_2$ which minimizes $\|\mathbf{A} \cdot \mathbf{x} - \mathbf{b}\|_2$ in the following problems:

(a) 



(b) 



29. In each part of the previous problem, find the right null space and show the most general solution 
vector can be represented as a linear combination of a unique vector in the row space of A plus an 
arbitrary scalar multiple of the right null space of A. 

30. An experiment yields the following data: 

t x 






0.00 1.001 

0.10 1.089 

0.23 1.240 

0.70 1.654 

0.90 1.738 

1.50 2.120 

2.65 1.412 

3.00 1.301 



We have fifteen times as much confidence in the first four data points as we do in all the others. Find the least squares best fit coefficients $a$, $b$, and $c$ if the assumed functional form is

(a) $x = a + bt + ct^2$,

(b) $x = a + b \sin t + c \sin 2t$.

Plot on a single graph the data points and the two best fit estimates. Which best fit estimate has the 
smallest least squares residual? 

31. For 




a) find the $\mathbf{P}^{-1} \cdot \mathbf{L} \cdot \mathbf{D} \cdot \mathbf{U}$ decomposition, and

b) find the singular values and the singular value decomposition. 

32. For the complex matrices $\mathbf{A}$ find eigenvectors, eigenvalues, demonstrate whether or not the eigenvectors are orthogonal, find (if possible) the matrix $\mathbf{S}$ such that $\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S}$ is of Jordan form, and find the singular value decomposition if

2 + i 2 
2 1 




33. Consider the action of the matrix 



3 1 

2 2 



on the unit square with vertices at (0,0), (1,0), (1,1), and (0,1). Give a plot of the original unit 
square and its image following the alibi mapping. Also decompose A under a) Q • R decomposition, 
and b) singular value decomposition, and for each decomposition plot the series of mappings under 
the action of each component of the decomposition. 
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Chapter 9 
Dynamical systems 



see Kaplan, Chapter 9,
see Drazin,
see Lopez, Chapter 12,
see Hirsch and Smale,
see Guckenheimer and Holmes,
see Wiggins,
see Strogatz.



In this chapter we consider the evolution of systems, often called dynamic systems. Generally, 
we will be concerned with systems which can be described by sets of ordinary differential 
equations, both linear and non-linear. Some other classes of systems will also be studied. 

9.1 Paradigm problems 

We first consider some paradigm problems which will illustrate the techniques used to solve 
non-linear systems of ordinary differential equations. Systems of equations are typically more 
complicated than scalar differential equations. The fundamental procedure for analyzing 
systems of non-linear ordinary differential equations is to 

• Cast the system into a standard form. 

• Identify the equilibria of the system. 

• If possible, linearize the system about its equilibria. 

• If linearizable, ascertain the stability of the linearized system to small disturbances. 

• If not linearizable, attempt to ascertain the stability of the non-linear system near its 
equilibria. 

• Solve the full non-linear system. 

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9.1.1 Autonomous example 

First consider a simple example of what is known as an autonomous system. An autonomous system of ordinary differential equations can be written in the form

$$\frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}). \qquad (9.1)$$

Notice that the independent variable $t$ does not appear explicitly.



I 

Example 9.1 

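For $\mathbf{x} \in \mathbb{R}^2$, $t \in \mathbb{R}^1$, $\mathbf{f}: \mathbb{R}^2 \to \mathbb{R}^2$, consider

$$\frac{dx_1}{dt} = x_2 - x_1^2 = f_1(x_1, x_2), \qquad (9.2)$$

$$\frac{dx_2}{dt} = x_2 - x_1 = f_2(x_1, x_2). \qquad (9.3)$$

The curves defined in the $(x_1, x_2)$ plane by $f_1 = 0$ and $f_2 = 0$ are very useful in determining both the fixed points (found at the intersection) and the behavior of the system of differential equations. In fact one can sketch trajectories of paths in this phase space by inspection in many cases. The loci of points where $f_1 = 0$ and $f_2 = 0$ are plotted in Fig. 9.1. The zeroes are found at $(x_1, x_2)^T = (0, 0)^T, (1, 1)^T$. Linearize about both points by neglecting quadratic and higher powers of deviations from the critical points to find the local behavior of the solution near these points. Near $(0, 0)$, the linearization is

$$\frac{dx_1}{dt} = x_2, \qquad (9.4)$$

$$\frac{dx_2}{dt} = x_2 - x_1, \qquad (9.5)$$

or

$$\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \qquad (9.6)$$

This is of the form

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}. \qquad (9.7)$$

And with

$$\mathbf{S} \cdot \mathbf{z} = \mathbf{x}, \qquad (9.8)$$

where $\mathbf{S}$ is a constant matrix, we get

$$\frac{d}{dt}(\mathbf{S} \cdot \mathbf{z}) = \mathbf{S} \cdot \frac{d\mathbf{z}}{dt} = \mathbf{A} \cdot \mathbf{S} \cdot \mathbf{z}, \qquad (9.9)$$

$$\frac{d\mathbf{z}}{dt} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} \cdot \mathbf{z}. \qquad (9.10)$$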

dt 

At this point we assume that A has distinct eigenvalues and linearly independent eigenvectors; other 
cases are easily handled. If we choose S such that its columns contain the eigenvectors of A, we will 
get a diagonal matrix, which will lead to a set of uncoupled differential equations; each of these can be 
solved individually. So for our A, standard linear algebra gives 



$$\mathbf{S} = \begin{pmatrix} \frac{1}{2} + \frac{\sqrt{3}}{2} i & \frac{1}{2} - \frac{\sqrt{3}}{2} i \\ 1 & 1 \end{pmatrix}. \qquad (9.11)$$

Figure 9.1: Phase plane for $dx_1/dt = x_2 - x_1^2$, $dx_2/dt = x_2 - x_1$, along with equilibrium points $(0,0)$ and $(1,1)$, separatrices $x_2 - x_1^2 = 0$, $x_2 - x_1 = 0$, solution trajectories, and corresponding vector field. (Figure labels: spiral source node; saddle node.)



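With this choice we get the eigenvalue matrix

$$\boldsymbol{\Lambda} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} = \begin{pmatrix} \frac{1}{2} - \frac{\sqrt{3}}{2} i & 0 \\ 0 & \frac{1}{2} + \frac{\sqrt{3}}{2} i \end{pmatrix}. \qquad (9.12)$$

So we get two uncoupled equations for $\mathbf{z}$:

$$\frac{dz_1}{dt} = \underbrace{\left(\frac{1}{2} - \frac{\sqrt{3}}{2} i\right)}_{=\lambda_1} z_1, \qquad (9.13)$$

$$\frac{dz_2}{dt} = \underbrace{\left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right)}_{=\lambda_2} z_2, \qquad (9.14)$$

which have solutions

$$z_1 = c_1 \exp\left(\left(\frac{1}{2} - \frac{\sqrt{3}}{2} i\right) t\right), \qquad (9.15)$$

$$z_2 = c_2 \exp\left(\left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right) t\right). \qquad (9.16)$$

Then we form $\mathbf{x}$ by taking $\mathbf{x} = \mathbf{S} \cdot \mathbf{z}$ so that

$$x_1 = \left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right) c_1 \exp\left(\left(\frac{1}{2} - \frac{\sqrt{3}}{2} i\right) t\right) + \left(\frac{1}{2} - \frac{\sqrt{3}}{2} i\right) c_2 \exp\left(\left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right) t\right), \qquad (9.17)$$

$$x_2 = c_1 \exp\left(\left(\frac{1}{2} - \frac{\sqrt{3}}{2} i\right) t\right) + c_2 \exp\left(\left(\frac{1}{2} + \frac{\sqrt{3}}{2} i\right) t\right). \qquad (9.18)$$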



Since there is a positive real coefficient in the exponential terms, both x\ and x 2 grow exponentially. 
The imaginary component indicates that this is an oscillatory growth. Hence, there is no tendency for 
a solution which is initially close to (0, 0), to remain there. So the fixed point is unstable. 
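Consider next the fixed point at $(1, 1)$. First define a new set of local variables:

$$\tilde{x}_1 = x_1 - 1, \qquad (9.19)$$

$$\tilde{x}_2 = x_2 - 1. \qquad (9.20)$$

Then

$$\frac{dx_1}{dt} = \frac{d\tilde{x}_1}{dt} = (\tilde{x}_2 + 1) - (\tilde{x}_1 + 1)^2, \qquad (9.21)$$

$$\frac{dx_2}{dt} = \frac{d\tilde{x}_2}{dt} = (\tilde{x}_2 + 1) - (\tilde{x}_1 + 1). \qquad (9.22)$$

Expanding, we get

$$\frac{d\tilde{x}_1}{dt} = (\tilde{x}_2 + 1) - \tilde{x}_1^2 - 2\tilde{x}_1 - 1, \qquad (9.23)$$

$$\frac{d\tilde{x}_2}{dt} = (\tilde{x}_2 + 1) - (\tilde{x}_1 + 1). \qquad (9.24)$$

Linearizing about $(\tilde{x}_1, \tilde{x}_2) = (0, 0)$, we find

$$\frac{d\tilde{x}_1}{dt} = \tilde{x}_2 - 2\tilde{x}_1, \qquad (9.25)$$

$$\frac{d\tilde{x}_2}{dt} = \tilde{x}_2 - \tilde{x}_1, \qquad (9.26)$$

$$\frac{d}{dt}\begin{pmatrix} \tilde{x}_1 \\ \tilde{x}_2 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \tilde{x}_1 \\ \tilde{x}_2 \end{pmatrix}. \qquad (9.27)$$

Going through an essentially identical exercise gives the eigenvalues to be

$$\lambda_1 = -\frac{1}{2} + \frac{\sqrt{5}}{2} > 0, \qquad (9.28)$$

$$\lambda_2 = -\frac{1}{2} - \frac{\sqrt{5}}{2} < 0, \qquad (9.29)$$

which in itself shows the solution to be essentially unstable since there is a positive eigenvalue. After the usual linear algebra and back transformations, one obtains the local solution:

$$x_1 = 1 + c_1 \left(\frac{3 - \sqrt{5}}{2}\right) \exp\left(\left(-\frac{1}{2} + \frac{\sqrt{5}}{2}\right) t\right) + c_2 \left(\frac{3 + \sqrt{5}}{2}\right) \exp\left(\left(-\frac{1}{2} - \frac{\sqrt{5}}{2}\right) t\right), \qquad (9.30)$$

$$x_2 = 1 + c_1 \exp\left(\left(-\frac{1}{2} + \frac{\sqrt{5}}{2}\right) t\right) + c_2 \exp\left(\left(-\frac{1}{2} - \frac{\sqrt{5}}{2}\right) t\right). \qquad (9.31)$$

Note that while this solution is generally unstable, in the special case in which $c_1 = 0$ the solution decays to the fixed point, so the fixed point is stable along that direction. Such is characteristic of a saddle node.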
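The local analysis of both fixed points can be reproduced with a few lines of code. The following is a minimal sketch assuming NumPy; the function names `f` and `jacobian` are illustrative, and the Jacobian is written out by hand from Eqs. (9.2) and (9.3).

```python
import numpy as np

def f(x):
    """Right-hand side of the system: (x2 - x1^2, x2 - x1)."""
    x1, x2 = x
    return np.array([x2 - x1**2, x2 - x1])

def jacobian(x):
    """Matrix of partial derivatives d f_i / d x_j, written out by hand."""
    x1, _ = x
    return np.array([[-2.0 * x1, 1.0],
                     [-1.0, 1.0]])

fixed_points = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
eigs = [np.linalg.eigvals(jacobian(p)) for p in fixed_points]
for p, lam in zip(fixed_points, eigs):
    # Positive real parts indicate local instability.
    print(p, f(p), lam)
```

The complex pair with positive real part at (0,0) corresponds to the spiral source, and the real pair of opposite sign at (1,1) corresponds to the saddle.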

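As an interesting aside, we can use Eq. (6.371) to calculate the curvature field for this system. With the notation of the present section, the curvature field is given by

$$\kappa = \frac{\sqrt{(\mathbf{f}^T \cdot \mathbf{F} \cdot \mathbf{F}^T \cdot \mathbf{f})(\mathbf{f}^T \cdot \mathbf{f}) - (\mathbf{f}^T \cdot \mathbf{F}^T \cdot \mathbf{f})^2}}{(\mathbf{f}^T \cdot \mathbf{f})^{3/2}}, \qquad (9.32)$$

where $\mathbf{F}$, the gradient of the vector field $\mathbf{f}$, is given by the analog of Eq. (6.370)

$$\mathbf{F} = \nabla \mathbf{f}^T. \qquad (9.33)$$

So with

$$\mathbf{f} = \begin{pmatrix} f_1(x_1, x_2) \\ f_2(x_1, x_2) \end{pmatrix} = \begin{pmatrix} x_2 - x_1^2 \\ x_2 - x_1 \end{pmatrix}, \qquad \mathbf{F} = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} \\ \frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} \end{pmatrix} = \begin{pmatrix} -2x_1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad (9.34)$$

detailed calculation reveals that

$$\kappa = \frac{\sqrt{\left(-x_1^2 + x_1^3 + x_1^4 + x_1 x_2 - x_1^2 x_2 - 2x_1^3 x_2 - x_2^2 + 2x_1 x_2^2\right)^2}}{\left(x_1^2 + x_1^4 - 2x_1 x_2 - 2x_1^2 x_2 + 2x_2^2\right)^{3/2}}. \qquad (9.35)$$

A plot of the curvature field is shown in Fig. 9.2. Because $\kappa$ varies over orders of magnitude, the contours are for $\ln \kappa$ to more easily visualize the variation. Regions of high curvature are noted near both critical points and in the regions between the curves $x_2 = x_1$ and $x_2 = x_1^2$ for $x_1 \in [0, 1]$. Comparison with Fig. 9.1 reveals consistency.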

I 



9.1.2 Non-autonomous example 

Next, consider a more complicated example. Among other things, the system as originally cast is non-autonomous in that the independent variable $t$ appears explicitly. Additionally, it is coupled and contains hidden singularities. Some operations are necessary in order to cast the system in standard form.



1 

Example 9.2 










For $\mathbf{x} \in \mathbb{R}^2$, $t \in \mathbb{R}^1$, $\mathbf{f}: \mathbb{R}^2 \times \mathbb{R}^1 \to \mathbb{R}^2$, analyze

$$t \frac{dx_1}{dt} + x_2 x_1 \frac{dx_2}{dt} = x_1 + t = f_1(x_1, x_2, t), \qquad (9.36)$$

$$x_1 \frac{dx_1}{dt} + x_2^2 \frac{dx_2}{dt} = x_1 t = f_2(x_1, x_2, t), \qquad (9.37)$$

$$x_1(0) = x_{10}, \qquad x_2(0) = x_{20}. \qquad (9.38)$$

Figure 9.2: Contours of $\ln \kappa$, where $\kappa$ is trajectory curvature for trajectories of solutions to $dx_1/dt = x_2 - x_1^2$, $dx_2/dt = x_2 - x_1$. Separatrices $x_2 - x_1^2 = 0$ and $x_2 - x_1 = 0$ are also plotted. Red shading corresponds to large trajectory curvature; blue shading corresponds to small trajectory curvature.



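Let

$$\frac{dt}{ds} = 1, \qquad t(0) = 0, \qquad (9.39)$$

and further $y_1 = x_1$, $y_2 = x_2$, $y_3 = t$. Then with $s \in \mathbb{R}^1$, $\mathbf{y} \in \mathbb{R}^3$, $\mathbf{g}: \mathbb{R}^3 \to \mathbb{R}^3$,

$$y_3 \frac{dy_1}{ds} + y_2 y_1 \frac{dy_2}{ds} = y_1 + y_3 = g_1(y_1, y_2, y_3), \qquad (9.40)$$

$$y_1 \frac{dy_1}{ds} + y_2^2 \frac{dy_2}{ds} = y_1 y_3 = g_2(y_1, y_2, y_3), \qquad (9.41)$$

$$\frac{dy_3}{ds} = 1 = g_3(y_1, y_2, y_3), \qquad (9.42)$$

$$y_1(0) = y_{10}, \qquad y_2(0) = y_{20}, \qquad y_3(0) = 0. \qquad (9.43)$$

In matrix form, we have

$$\begin{pmatrix} y_3 & y_2 y_1 & 0 \\ y_1 & y_2^2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{dy_1}{ds} \\ \frac{dy_2}{ds} \\ \frac{dy_3}{ds} \end{pmatrix} = \begin{pmatrix} y_1 + y_3 \\ y_1 y_3 \\ 1 \end{pmatrix}. \qquad (9.44)$$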



Inverting the coefficient matrix, we obtain the following equation which is in autonomous form:

$$\frac{d}{ds}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} \frac{y_1 y_2 - y_1^2 y_3 + y_2 y_3}{y_2 y_3 - y_1^2} \\ \frac{y_1 (y_3^2 - y_1 - y_3)}{y_2 (y_2 y_3 - y_1^2)} \\ 1 \end{pmatrix} = \begin{pmatrix} h_1(y_1, y_2, y_3) \\ h_2(y_1, y_2, y_3) \\ h_3(y_1, y_2, y_3) \end{pmatrix}. \qquad (9.45)$$

There are potential singularities at $y_2 = 0$ and $y_2 y_3 = y_1^2$. Under such conditions, the determinant of the coefficient matrix is zero, and $dy_i/ds$ is not uniquely determined. One way to address the potential singularities is by defining a new independent variable $u \in \mathbb{R}^1$ via the equation

$$\frac{ds}{du} = y_2 (y_2 y_3 - y_1^2). \qquad (9.46)$$

The system, Eq. (9.45), then transforms to

$$\frac{d}{du}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} y_2 (y_1 y_2 - y_1^2 y_3 + y_2 y_3) \\ y_1 (y_3^2 - y_1 - y_3) \\ y_2 (y_2 y_3 - y_1^2) \end{pmatrix} = \begin{pmatrix} p_1(y_1, y_2, y_3) \\ p_2(y_1, y_2, y_3) \\ p_3(y_1, y_2, y_3) \end{pmatrix}. \qquad (9.47)$$

This equation actually has an infinite number of fixed points, all of which lie on a line in the three-dimensional phase volume. The line is given parametrically by $(y_1, y_2, y_3)^T = (0, 0, v)^T$, $v \in \mathbb{R}^1$. Here $v$ is just a parameter used in describing the line of fixed points. However, it turns out in this case that the Taylor series expansions yield no linear contribution near any of the fixed points, so we don't get to use the standard linear analysis technique! The problem has an essential non-linear essence, even near fixed points. More potent methods would need to be employed, but the example demonstrates the principle. Figure 9.3 gives a numerically obtained solution for $y_1(u)$, $y_2(u)$, $y_3(u)$ along with a trajectory in $(y_1, y_2, y_3)$ space when $y_1(0) = 1$, $y_2(0) = -1$, $y_3(0) = 0$. This corresponds to $x_1(t = 0) = 1$, $x_2(t = 0) = -1$.




Figure 9.3: Solutions for one set of initial conditions, $y_1(0) = 1$, $y_2(0) = -1$, $y_3(0) = 0$, for second paradigm example: trajectory in phase volume $(y_1, y_2, y_3)$; also $y_1(u)$, $y_2(u)$, $y_3(u)$ and $x_1(t)$, $x_2(t)$. Here $y_1 = x_1$, $y_2 = x_2$, $y_3 = t$.

We note that while the solutions are monotonic in the variable $u$, they are not monotonic in $t$ after the transformation back to $x_1(t)$, $x_2(t)$ is effected. Also, while it appears there are points ($u = 0.38$, $u = 0.84$, $u = 1.07$) where the derivatives $dy_1/du$, $dy_2/du$, $dy_3/du$ become unbounded, closer inspection reveals that they are simply points of steep, but bounded, derivatives. However, at points where the slope $dy_3/du = dt/du$ changes sign, the derivatives $dx_1/dt$ and $dx_2/dt$ formally are infinite, as is reflected in the cyclic behavior exhibited in the plots of $x_1$ versus $t$ or $x_2$ versus $t$.

I 



9.2 General theory 

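Consider $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{g}: \mathbb{R}^N \times \mathbb{R}^N \times \mathbb{R}^1 \to \mathbb{R}^N$. A general non-linear system of differential-algebraic equations takes on the form

$$\mathbf{g}\left(\frac{d\mathbf{x}}{dt}, \mathbf{x}, t\right) = \mathbf{0}. \qquad (9.48)$$

Such general problems can be challenging. Let us here restrict to a form which is quasi-linear in the time-derivatives. Thus, consider $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{A}: \mathbb{R}^N \times \mathbb{R}^1 \to \mathbb{R}^N \times \mathbb{R}^N$, $\mathbf{f}: \mathbb{R}^N \times \mathbb{R}^1 \to \mathbb{R}^N$. Then the quasi-linear problem of the form

$$\mathbf{A}(\mathbf{x}, t) \cdot \frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}, t), \qquad \mathbf{x}(0) = \mathbf{x}_o, \qquad (9.49)$$

can be reduced to autonomous form in the following manner. With

$$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}, \qquad \mathbf{A}(\mathbf{x}, t) = \begin{pmatrix} a_{11}(\mathbf{x}, t) & \ldots & a_{1N}(\mathbf{x}, t) \\ \vdots & \ddots & \vdots \\ a_{N1}(\mathbf{x}, t) & \ldots & a_{NN}(\mathbf{x}, t) \end{pmatrix}, \qquad \mathbf{f}(\mathbf{x}, t) = \begin{pmatrix} f_1(x_1, \ldots, x_N, t) \\ \vdots \\ f_N(x_1, \ldots, x_N, t) \end{pmatrix}, \qquad (9.50)$$

define $s \in \mathbb{R}^1$ such that

$$\frac{dt}{ds} = 1, \qquad t(0) = 0. \qquad (9.51)$$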



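Then define $\mathbf{y} \in \mathbb{R}^{N+1}$, $\mathbf{B}: \mathbb{R}^{N+1} \to \mathbb{R}^{N+1} \times \mathbb{R}^{N+1}$, $\mathbf{g}: \mathbb{R}^{N+1} \to \mathbb{R}^{N+1}$, such that along with $s \in \mathbb{R}^1$,

$$\mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \\ y_{N+1} \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_N \\ t \end{pmatrix}, \qquad (9.52)$$

$$\mathbf{B}(\mathbf{y}) = \begin{pmatrix} a_{11}(\mathbf{y}) & \ldots & a_{1N}(\mathbf{y}) & 0 \\ \vdots & \ddots & \vdots & \vdots \\ a_{N1}(\mathbf{y}) & \ldots & a_{NN}(\mathbf{y}) & 0 \\ 0 & \ldots & 0 & 1 \end{pmatrix}, \qquad (9.53)$$

$$\mathbf{g}(\mathbf{y}) = \begin{pmatrix} g_1(y_1, \ldots, y_{N+1}) \\ \vdots \\ g_N(y_1, \ldots, y_{N+1}) \\ g_{N+1}(y_1, \ldots, y_{N+1}) \end{pmatrix} = \begin{pmatrix} f_1(x_1, \ldots, x_N, t) \\ \vdots \\ f_N(x_1, \ldots, x_N, t) \\ 1 \end{pmatrix}. \qquad (9.54)$$

Equation (9.49) then transforms to

$$\mathbf{B}(\mathbf{y}) \cdot \frac{d\mathbf{y}}{ds} = \mathbf{g}(\mathbf{y}). \qquad (9.55)$$

By forming $\mathbf{B}^{-1}$, assuming $\mathbf{B}$ is non-singular, Eq. (9.55) can be written as

$$\frac{d\mathbf{y}}{ds} = \mathbf{B}^{-1}(\mathbf{y}) \cdot \mathbf{g}(\mathbf{y}), \qquad (9.56)$$

or by taking

$$\mathbf{B}^{-1}(\mathbf{y}) \cdot \mathbf{g}(\mathbf{y}) \equiv \mathbf{h}(\mathbf{y}), \qquad (9.57)$$

we get the form, commonly called autonomous form, with $s \in \mathbb{R}^1$, $\mathbf{y} \in \mathbb{R}^{N+1}$, $\mathbf{h}: \mathbb{R}^{N+1} \to \mathbb{R}^{N+1}$:

$$\frac{d\mathbf{y}}{ds} = \mathbf{h}(\mathbf{y}). \qquad (9.58)$$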
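The augmentation from Eq. (9.49) to Eq. (9.58) can be sketched in code. The following is a minimal illustration assuming NumPy; the helper name `make_autonomous` and the two example callables are hypothetical, with the example coefficients taken from the quasi-linear system of Example 9.2. The sketch solves the linear system at each evaluation instead of forming the inverse explicitly, and it will fail where B is singular.

```python
import numpy as np

def make_autonomous(A, f):
    """Return h(y) for y = (x_1, ..., x_N, t), given callables A(x, t), f(x, t)."""
    def h(y):
        x, t = y[:-1], y[-1]
        n = x.size
        # B is A bordered by a final row and column that enforce dt/ds = 1.
        B = np.zeros((n + 1, n + 1))
        B[:n, :n] = A(x, t)
        B[n, n] = 1.0
        g = np.append(f(x, t), 1.0)
        return np.linalg.solve(B, g)   # h = B^{-1} . g
    return h

# Example coefficients: the quasi-linear system of Example 9.2.
def A_example(x, t):
    return np.array([[t, x[1] * x[0]],
                     [x[0], x[1]**2]])

def f_example(x, t):
    return np.array([x[0] + t, x[0] * t])

h = make_autonomous(A_example, f_example)
print(h(np.array([1.0, -1.0, 2.0])))
```

At the sample point $(y_1, y_2, y_3) = (1, -1, 2)$ the result can be compared against the closed-form right-hand side of Eq. (9.45).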

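If $\mathbf{B}(\mathbf{y})$ is singular, then $\mathbf{h}$ has singularities. At such singular points, we cannot form a linearly independent set of $d\mathbf{y}/ds$, and the system is better considered as a set of differential-algebraic equations. If the source of the singularity can be identified, a singularity-free autonomous set of equations can often be written. For example, suppose $\mathbf{h}$ can be rewritten as

$$\mathbf{h}(\mathbf{y}) = \frac{\mathbf{p}(\mathbf{y})}{q(\mathbf{y})}, \qquad (9.59)$$

where $\mathbf{p}$ and $q$ have no singularities. Then we can remove the singularity by introducing the new independent variable $u \in \mathbb{R}^1$ such that

$$\frac{ds}{du} = q(\mathbf{y}). \qquad (9.60)$$

Using the chain rule, the system then becomes

$$\frac{d\mathbf{y}}{ds} = \frac{\mathbf{p}(\mathbf{y})}{q(\mathbf{y})}, \qquad (9.61)$$

$$\frac{ds}{du} \frac{d\mathbf{y}}{ds} = q(\mathbf{y}) \frac{\mathbf{p}(\mathbf{y})}{q(\mathbf{y})}, \qquad (9.62)$$

$$\frac{d\mathbf{y}}{du} = \mathbf{p}(\mathbf{y}), \qquad (9.63)$$

which has no singularities.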

Casting ordinary differential equations systems in autonomous form is the starting point 
for most problems and most theoretical development. The task from here generally proceeds 
as follows: 

• Find all the zeroes of $\mathbf{h}$. This is an algebra problem, which can be topologically difficult for non-linear problems.

• If $\mathbf{h}$ has any singularities, redefine variables in the manner demonstrated to remove the singularity.

• If possible, linearize $\mathbf{h}$ (or its equivalent) about each of its zeroes.

• Perform a local analysis of the system of differential equations near zeroes.

• If the system is linear, an eigenvalue analysis is sufficient to reveal stability; for non-linear systems, the situation is not always straightforward.

9.3 Iterated maps 

A map $\mathbf{f}: \mathbb{R}^N \to \mathbb{R}^N$ can be iterated to give a dynamical system of the form

$$x_n^{k+1} = f_n(x_1^k, x_2^k, \ldots, x_N^k), \qquad n = 1, \ldots, N. \qquad (9.64)$$

Given an initial point $x_n^0$, $(n = 1, \ldots, N)$ in $\mathbb{R}^N$, a series of images $x_n^1, x_n^2, x_n^3, \ldots$ can be found as $k = 0, 1, 2, \ldots$. The map is dissipative or conservative according to whether the diameter of a set is larger than that of its image or the same, respectively, i.e. if the determinant of the Jacobian matrix, $\det \partial f_n / \partial x_j$, is less than or equal to 1, respectively.

The point $x_i = \bar{x}_i$ is a fixed point of the map if it maps to itself, i.e. if

$$\bar{x}_n = f_n(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_N), \qquad n = 1, \ldots, N. \qquad (9.65)$$

The fixed point $x_n = 0$ is linearly unstable if a small perturbation from it leads the images farther and farther away. Otherwise it is stable. A special case of this is asymptotic stability wherein the image returns arbitrarily close to the fixed point.

A linear map can be written as $x_i^{k+1} = \sum_{j=1}^N A_{ij} x_j^k$, $(i = 1, 2, \ldots)$ or $\mathbf{x}^{k+1} = \mathbf{A} \cdot \mathbf{x}^k$. The origin $\mathbf{x} = \mathbf{0}$ is a fixed point of this map. If $\|\mathbf{A}\| > 1$, then $\|\mathbf{x}^{k+1}\| > \|\mathbf{x}^k\|$, and the map is unstable. Otherwise it is stable.



I 

Example 9.3 

Examine the linear stability of the fixed points of the logistics map, popularized by May (Robert McCredie May, 1936-, Australian-Anglo ecologist),

$$x_{k+1} = r x_k (1 - x_k). \qquad (9.66)$$

We take $r \in [0, 4]$ so that $x_k \in [0, 1]$ maps onto $x_{k+1} \in [0, 1]$. That is, the mapping is onto itself. The fixed points are solutions of

$$\bar{x} = r \bar{x} (1 - \bar{x}), \qquad (9.67)$$

which are

$$\bar{x} = 0, \qquad \bar{x} = 1 - \frac{1}{r}. \qquad (9.68)$$

Consider the mapping itself. For an initial seed $x^0$, we generate a series of $x^k$. For example if we take $r = 0.4$ and $x^0 = 0.3$, we get

$$\begin{aligned}
x^0 &= 0.3, & (9.69)\\
x^1 &= 0.4(0.3)(1 - 0.3) = 0.084, & (9.70)\\
x^2 &= 0.4(0.084)(1 - 0.084) = 0.0307776, & (9.71)\\
x^3 &= 0.4(0.0307776)(1 - 0.0307776) = 0.0119321, & (9.72)\\
x^4 &= 0.4(0.0119321)(1 - 0.0119321) = 0.0047159, & (9.73)\\
x^5 &= 0.4(0.0047159)(1 - 0.0047159) = 0.00187747, & (9.74)\\
&\;\;\vdots \\
x^\infty &= 0. & (9.75)
\end{aligned}$$

For this value of $r$, the solution approaches the fixed point of 0. Consider $r = 4/3$ and $x^0 = 0.3$:

$$\begin{aligned}
x^0 &= 0.3, & (9.76)\\
x^1 &= (4/3)(0.3)(1 - 0.3) = 0.28, & (9.77)\\
x^2 &= (4/3)(0.28)(1 - 0.28) = 0.2688, & (9.78)\\
x^3 &= (4/3)(0.2688)(1 - 0.2688) = 0.262062, & (9.79)\\
x^4 &= (4/3)(0.262062)(1 - 0.262062) = 0.257847, & (9.80)\\
x^5 &= (4/3)(0.257847)(1 - 0.257847) = 0.255149, & (9.81)\\
&\;\;\vdots \\
x^\infty &= 0.250 = 1 - \frac{1}{r}. & (9.82)
\end{aligned}$$

In this case, the solution was attracted to the alternate fixed point.

To analyze the stability of each fixed point, we give it a small perturbation $x'$. Thus, $\bar{x} + x'$ is mapped to $\bar{x} + x''$, where

$$\bar{x} + x'' = r(\bar{x} + x')(1 - \bar{x} - x') = r(\bar{x} - \bar{x}^2 + x' - 2x'\bar{x} - x'^2). \qquad (9.83)$$

Neglecting small terms, we get

$$\bar{x} + x'' = r(\bar{x} - \bar{x}^2 + x' - 2x'\bar{x}) = r\bar{x}(1 - \bar{x}) + r x'(1 - 2\bar{x}). \qquad (9.84)$$

Simplifying, we get

$$x'' = r x'(1 - 2\bar{x}). \qquad (9.85)$$

A fixed point is stable if $|x''/x'| < 1$. This indicates that the perturbation is decaying. Now consider each fixed point in turn.

$\bar{x} = 0$:

$$\begin{aligned}
x'' &= r x'(1 - 2(0)), & (9.86)\\
&= r x', & (9.87)\\
\left|\frac{x''}{x'}\right| &= r. & (9.88)
\end{aligned}$$

This is stable if $r < 1$.

$\bar{x} = 1 - 1/r$:

$$\begin{aligned}
x'' &= r x'\left(1 - 2\left(1 - \frac{1}{r}\right)\right), & (9.89)\\
&= (2 - r) x', & (9.90)\\
\left|\frac{x''}{x'}\right| &= |2 - r|. & (9.91)
\end{aligned}$$

This is unstable for $r < 1$, stable for $1 < r < 3$, unstable for $r > 3$.

What happens to the map for $r > 3$? Consider $r = 3.2$ and $x^0 = 0.3$:

$$\begin{aligned}
x^0 &= 0.3, & (9.92)\\
x^1 &= 3.2(0.3)(1 - 0.3) = 0.672, & (9.93)\\
x^2 &= 3.2(0.672)(1 - 0.672) = 0.705331, & (9.94)\\
x^3 &= 3.2(0.705331)(1 - 0.705331) = 0.665085, & (9.95)\\
x^4 &= 3.2(0.665085)(1 - 0.665085) = 0.71279, & (9.96)\\
x^5 &= 3.2(0.71279)(1 - 0.71279) = 0.655105, & (9.97)\\
x^6 &= 3.2(0.655105)(1 - 0.655105) = 0.723016, & (9.98)\\
x^7 &= 3.2(0.723016)(1 - 0.723016) = 0.640845, & (9.99)\\
x^8 &= 3.2(0.640845)(1 - 0.640845) = 0.736521, & (9.100)\\
&\;\;\vdots \\
x^{\infty - 1} &= 0.799455, & (9.101)\\
x^\infty &= 0.513045. & (9.102)
\end{aligned}$$

This system has bifurcated. It oscillates between two points, never going to the fixed point. The two points about which it oscillates are quite constant for this value of $r$. For greater values of $r$, the system moves between 4, 8, 16, ... points. Such is the essence of bifurcation phenomena. A plot, known as a bifurcation diagram, of the equilibrium values of $x$ as a function of $r$ is given in Fig. 9.4.
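The iterations tabulated in this example are easy to reproduce. The following is a minimal sketch in plain Python; the function names `logistic_orbit` and `multiplier` and the iteration count of 2000 are illustrative assumptions, chosen only so that transients have decayed.

```python
def logistic_orbit(r, x0, n):
    """Return [x^0, x^1, ..., x^n] for x^{k+1} = r x^k (1 - x^k)."""
    orbit = [x0]
    for _ in range(n):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit

def multiplier(r, x_bar):
    """Linearized growth factor |r (1 - 2 x_bar)| of a small perturbation."""
    return abs(r * (1.0 - 2.0 * x_bar))

for r in (0.4, 4.0 / 3.0, 3.2):
    # The last two iterates reveal either a fixed point or a two-cycle.
    tail = logistic_orbit(r, 0.3, 2000)[-2:]
    print(r, tail, multiplier(r, 0.0), multiplier(r, 1.0 - 1.0 / r))
```

For r = 3.2 both multipliers exceed one, so neither fixed point attracts the orbit, and the last two iterates alternate between the two values of the period-two cycle.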

I 



Other maps that have been studied are:

• Hénon map (Michel Hénon, 1931-, French mathematician and astronomer):

$$x_{k+1} = y_k + 1 - a x_k^2, \qquad (9.103)$$

$$y_{k+1} = b x_k. \qquad (9.104)$$

For $a = 1.3$, $b = 0.34$, the attractor is periodic, while for $a = 1.4$, $b = 0.34$, the map has a strange attractor.

• Dissipative standard map:

$$x_{k+1} = x_k + y_{k+1} \mod 2\pi, \qquad (9.105)$$

$$y_{k+1} = \lambda y_k + k \sin x_k. \qquad (9.106)$$

If $\lambda = 1$, the map is area preserving.

Figure 9.4: Bifurcation diagram of $\bar{x} = \lim_{k \to \infty} x_k$ as a function of $r$ for the logistics map, $x_{k+1} = r x_k (1 - x_k)$ for $r \in [0, 4]$.



9.4 High order scalar differential equations 



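An equation with $x \in \mathbb{R}^1$, $t \in \mathbb{R}^1$, $\mathbf{a}: \mathbb{R}^1 \times \mathbb{R}^1 \to \mathbb{R}^N$, $f: \mathbb{R}^1 \to \mathbb{R}^1$ of the form

$$\frac{d^N x}{dt^N} + a_N(x, t) \frac{d^{N-1} x}{dt^{N-1}} + \cdots + a_2(x, t) \frac{dx}{dt} + a_1(x, t) x = f(t), \qquad (9.107)$$

can be expressed as a system of $N + 1$ first order autonomous equations. Let $x = y_1$, $dx/dt = y_2$, $\ldots$, $d^{N-1}x/dt^{N-1} = y_N$, $t = y_{N+1}$. Then with $\mathbf{y} \in \mathbb{R}^{N+1}$, $s = t \in \mathbb{R}^1$, $\mathbf{a}: \mathbb{R}^1 \times \mathbb{R}^1 \to \mathbb{R}^N$, $f: \mathbb{R}^1 \to \mathbb{R}^1$,

$$\begin{aligned}
\frac{dy_1}{ds} &= y_2, & (9.108)\\
\frac{dy_2}{ds} &= y_3, & (9.109)\\
&\;\;\vdots \\
\frac{dy_{N-1}}{ds} &= y_N, & (9.110)\\
\frac{dy_N}{ds} &= -a_N(y_1, y_{N+1}) y_N - a_{N-1}(y_1, y_{N+1}) y_{N-1} - \cdots - a_1(y_1, y_{N+1}) y_1 + f(y_{N+1}), & (9.111)\\
\frac{dy_{N+1}}{ds} &= 1. & (9.112)
\end{aligned}$$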
(9.112) 
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I 

Example 9.4 

For $x \in \mathbb{R}^1$, $t \in \mathbb{R}^1$, consider the forced Duffing equation:

$$\frac{d^2 x}{dt^2} + x + x^3 = \sin(2t), \qquad x(0) = 0, \qquad \left.\frac{dx}{dt}\right|_{t=0} = 0. \qquad (9.113)$$

Here $a_2(x, t) = 0$, $a_1(x, t) = 1 + x^2$, $f(t) = \sin(2t)$. Now this non-linear differential equation with homogeneous boundary conditions and forcing has no analytic solution. It can be solved numerically; most solution techniques require a recasting as a system of first order equations. To recast this as an autonomous set of equations, with $\mathbf{y} \in \mathbb{R}^3$, $s \in \mathbb{R}^1$, consider

$$x = y_1, \qquad \frac{dx}{dt} = y_2, \qquad t = s = y_3. \qquad (9.114)$$

Then $d/dt = d/ds$, and the equations transform to

$$\frac{d}{ds}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} y_2 \\ -y_1 - y_1^3 + \sin(2 y_3) \\ 1 \end{pmatrix} = \begin{pmatrix} h_1(y_1, y_2, y_3) \\ h_2(y_1, y_2, y_3) \\ h_3(y_1, y_2, y_3) \end{pmatrix}. \qquad (9.115)$$

Note that this system has no equilibrium point as there exists no $\mathbf{y}$ for which $\mathbf{h} = \mathbf{0}$. Once the numerical solution is obtained, one transforms back to $(x, t)$ space. Fig. 9.5 gives the trajectory in the $(y_1, y_2, y_3)$ phase space and a plot of the corresponding solution $x(t)$ for $t \in [0, 50]$.



Figure 9.5: Phase space trajectory and solution $x(t)$ for forced Duffing equation. (Axes: $y_1 = x$, $y_2 = dx/dt$, $y_3 = t$.)
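A numerical solution of the recast Duffing system can be sketched as follows. This is a minimal illustration in plain Python using a classical fourth-order Runge-Kutta scheme; the choice of integrator, the step count of 5000, and the function names are assumptions made for the sketch and are not taken from the notes.

```python
import math

def h(y):
    """Right-hand side of the autonomous Duffing system, y = (x, dx/dt, t)."""
    y1, y2, y3 = y
    return (y2, -y1 - y1**3 + math.sin(2.0 * y3), 1.0)

def rk4_step(y, ds):
    """One classical fourth-order Runge-Kutta step of size ds."""
    k1 = h(y)
    k2 = h(tuple(yi + 0.5 * ds * ki for yi, ki in zip(y, k1)))
    k3 = h(tuple(yi + 0.5 * ds * ki for yi, ki in zip(y, k2)))
    k4 = h(tuple(yi + ds * ki for yi, ki in zip(y, k3)))
    return tuple(yi + ds * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def integrate(y0, s_end, n_steps):
    """Integrate from s = 0 to s_end and return the list of states."""
    ds = s_end / n_steps
    path = [y0]
    for _ in range(n_steps):
        path.append(rk4_step(path[-1], ds))
    return path

# Initial state (x, dx/dt, t) = (0, 0, 0), integrated to t = 50.
path = integrate((0.0, 0.0, 0.0), 50.0, 5000)
print(path[-1])
```

The third component simply tracks $t$, so plotting the first component against the third recovers $x(t)$.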
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9.5 Linear systems 



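For a linear system the coefficients $a_N, \ldots, a_2, a_1$ in equation (9.107) are independent of $x$. In general, for $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{A}: \mathbb{R}^1 \to \mathbb{R}^N \times \mathbb{R}^N$, $\mathbf{f}: \mathbb{R}^1 \to \mathbb{R}^N$, any linear system may be written in matrix form as

$$\frac{d\mathbf{x}}{dt} = \mathbf{A}(t) \cdot \mathbf{x} + \mathbf{f}(t), \qquad (9.116)$$

where

$$\mathbf{x} = \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_N(t) \end{pmatrix}, \qquad (9.117)$$

$$\mathbf{A} = \begin{pmatrix} a_{11}(t) & a_{12}(t) & \cdots & a_{1N}(t) \\ a_{21}(t) & a_{22}(t) & \cdots & a_{2N}(t) \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1}(t) & a_{N2}(t) & \cdots & a_{NN}(t) \end{pmatrix}, \qquad (9.118)$$

$$\mathbf{f} = \begin{pmatrix} f_1(t) \\ f_2(t) \\ \vdots \\ f_N(t) \end{pmatrix}. \qquad (9.119)$$

Here $\mathbf{A}$ and $\mathbf{f}$ are known. The solution can be written as $\mathbf{x} = \mathbf{x}_H + \mathbf{x}_P$, where $\mathbf{x}_H$ is the solution to the homogeneous equation, and $\mathbf{x}_P$ is the particular solution.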



9.5.1 Homogeneous equations with constant A 



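For $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{A} \in \mathbb{R}^N \times \mathbb{R}^N$, the solution of the homogeneous equation

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}, \qquad (9.120)$$

where $\mathbf{A}$ is a matrix of constants, is obtained by setting

$$\mathbf{x} = \mathbf{e} e^{\lambda t}, \qquad (9.121)$$

with a constant vector $\mathbf{e} \in \mathbb{R}^N$. Substituting into Eq. (9.120), we get

$$\lambda \mathbf{e} e^{\lambda t} = \mathbf{A} \cdot \mathbf{e} e^{\lambda t}, \qquad (9.122)$$

$$\lambda \mathbf{e} = \mathbf{A} \cdot \mathbf{e}. \qquad (9.123)$$

This is an eigenvalue problem where $\lambda$ is an eigenvalue and $\mathbf{e}$ is an eigenvector. In this case there is only one fixed point, namely the null vector:

$$\mathbf{x} = \mathbf{0}. \qquad (9.124)$$

9.5.1.1 N eigenvectors

We will assume that there is a full set of eigenvectors even though not all the eigenvalues are distinct. If $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N$ are the eigenvectors corresponding to eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$, then

$$\mathbf{x} = \sum_{n=1}^N c_n \mathbf{e}_n e^{\lambda_n t}, \qquad (9.125)$$

is the general solution, where $c_1, c_2, \ldots, c_N$ are arbitrary constants.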



I 

Example 9.5 

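For $\mathbf{x} \in \mathbb{R}^3$, $t \in \mathbb{R}^1$, $\mathbf{A} \in \mathbb{R}^3 \times \mathbb{R}^3$, solve $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$ where

$$\mathbf{A} = \begin{pmatrix} 1 & -1 & 4 \\ 3 & 2 & -1 \\ 2 & 1 & -1 \end{pmatrix}. \qquad (9.126)$$

The eigenvalues and eigenvectors are

$$\lambda_1 = 1, \qquad \mathbf{e}_1 = \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix}, \qquad (9.127)$$

$$\lambda_2 = 3, \qquad \mathbf{e}_2 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \qquad (9.128)$$

$$\lambda_3 = -2, \qquad \mathbf{e}_3 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}. \qquad (9.129)$$

Thus, the solution is

$$\mathbf{x} = c_1 \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix} e^{t} + c_2 \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} e^{3t} + c_3 \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} e^{-2t}. \qquad (9.130)$$

Expanding, we get

$$x_1(t) = -c_1 e^{t} + c_2 e^{3t} - c_3 e^{-2t}, \qquad (9.131)$$

$$x_2(t) = 4c_1 e^{t} + 2c_2 e^{3t} + c_3 e^{-2t}, \qquad (9.132)$$

$$x_3(t) = c_1 e^{t} + c_2 e^{3t} + c_3 e^{-2t}. \qquad (9.133)$$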
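The eigen-solution of this example can be verified numerically. The following is a minimal sketch assuming NumPy; the sample initial condition $(1, 0, 0)^T$, the evaluation time, and the finite-difference step are arbitrary assumptions used only to test that the assembled solution satisfies $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$.

```python
import numpy as np

A = np.array([[1.0, -1.0, 4.0],
              [3.0, 2.0, -1.0],
              [2.0, 1.0, -1.0]])

lams = np.array([1.0, 3.0, -2.0])
E = np.array([[-1.0, 1.0, -1.0],
              [4.0, 2.0, 1.0],
              [1.0, 1.0, 1.0]])   # columns are e_1, e_2, e_3

def x_general(t, c):
    """x(t) = sum_n c_n e_n exp(lambda_n t)."""
    return E @ (c * np.exp(lams * t))

def coefficients(x0):
    """Constants c_n that match an initial condition x(0) = x0."""
    return np.linalg.solve(E, x0)

c = coefficients(np.array([1.0, 0.0, 0.0]))  # sample initial condition (assumption)
t, dt = 0.3, 1e-6
# Central difference approximation of dx/dt, compared against A . x.
dxdt = (x_general(t + dt, c) - x_general(t - dt, c)) / (2.0 * dt)
residual = np.max(np.abs(dxdt - A @ x_general(t, c)))
print(c, residual)
```

Fixing the constants $c_n$ from an initial condition amounts to solving one linear system whose matrix has the eigenvectors as columns.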



I 

Example 9.6 

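For $\mathbf{x} \in \mathbb{R}^3$, $t \in \mathbb{R}^1$, $\mathbf{A} \in \mathbb{R}^3 \times \mathbb{R}^3$, solve $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$ where

$$\mathbf{A} = \begin{pmatrix} 2 & -1 & -1 \\ 2 & 1 & -1 \\ 0 & -1 & 1 \end{pmatrix}. \qquad (9.134)$$

The eigenvalues and eigenvectors are

$$\lambda_1 = 2, \qquad \mathbf{e}_1 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \qquad (9.135)$$

$$\lambda_2 = 1 + i, \qquad \mathbf{e}_2 = \begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix}, \qquad (9.136)$$

$$\lambda_3 = 1 - i, \qquad \mathbf{e}_3 = \begin{pmatrix} 1 \\ i \\ 1 \end{pmatrix}. \qquad (9.137)$$

Thus, the solution is

$$\mathbf{x} = c_1 \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} e^{2t} + c_2 \begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix} e^{(1+i)t} + c_3 \begin{pmatrix} 1 \\ i \\ 1 \end{pmatrix} e^{(1-i)t}, \qquad (9.138)$$

$$= c_1 \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} e^{2t} + c_2' \begin{pmatrix} \cos t \\ \sin t \\ \cos t \end{pmatrix} e^{t} + c_3' \begin{pmatrix} \sin t \\ -\cos t \\ \sin t \end{pmatrix} e^{t}, \qquad (9.139)$$

where $c_2' = c_2 + c_3$, $c_3' = i(c_2 - c_3)$.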



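9.5.1.2 < N eigenvectors

One solution of $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$ is $\mathbf{x} = e^{\mathbf{A}t} \cdot \mathbf{e}$, where $\mathbf{e}$ is a constant vector. If $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N$
are linearly independent vectors, then $\mathbf{x}_n = e^{\mathbf{A}t} \cdot \mathbf{e}_n$, $n = 1, \ldots, N$, are linearly independent
solutions. We would like to choose $\mathbf{e}_n$, $n = 1, 2, \ldots, N$, such that each $e^{\mathbf{A}t} \cdot \mathbf{e}_n$ is a series with
a finite number of terms. This can be done in the following manner. Since

$$e^{\mathbf{A}t} \cdot \mathbf{e} = e^{\lambda \mathbf{I} t} \cdot e^{(\mathbf{A} - \lambda \mathbf{I})t} \cdot \mathbf{e}, \tag{9.141}$$

$$= e^{\lambda t} \mathbf{I} \cdot e^{(\mathbf{A} - \lambda \mathbf{I})t} \cdot \mathbf{e}, \tag{9.142}$$

$$= e^{\lambda t} e^{(\mathbf{A} - \lambda \mathbf{I})t} \cdot \mathbf{e}, \tag{9.143}$$

$$= e^{\lambda t} \left( \mathbf{I} + (\mathbf{A} - \lambda \mathbf{I}) t + \frac{1}{2!} (\mathbf{A} - \lambda \mathbf{I})^2 t^2 + \cdots \right) \cdot \mathbf{e}, \tag{9.144}$$

the series will be finite if

$$(\mathbf{A} - \lambda \mathbf{I})^k \cdot \mathbf{e} = \mathbf{0}, \tag{9.145}$$

for some positive integer $k$.

9.5.1.3 Summary of method

The procedure to find $\mathbf{x}_n$, $(n = 1, 2, \ldots, N)$, the $N$ linearly independent solutions of

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}, \tag{9.146}$$

where $\mathbf{A}$ is a constant, is the following. First find all eigenvalues $\lambda_n$, $n = 1, \ldots, N$, and as
many eigenvectors $\mathbf{e}_k$, $k = 1, 2, \ldots, K$ as possible.

1. If $K = N$, the $N$ linearly independent solutions are $\mathbf{x}_n = e^{\lambda_n t} \mathbf{e}_n$.

2. If $K < N$, there are only $K$ linearly independent solutions of the type $\mathbf{x}_k = e^{\lambda_k t} \mathbf{e}_k$.
To find additional solutions corresponding to a multiple eigenvalue $\lambda$, find all linearly
independent $\mathbf{g}$ such that $(\mathbf{A} - \lambda \mathbf{I})^2 \cdot \mathbf{g} = \mathbf{0}$, but $(\mathbf{A} - \lambda \mathbf{I}) \cdot \mathbf{g} \ne \mathbf{0}$. Notice that generalized
eigenvectors will satisfy the requirement, though it has other solutions as well. For each
such $\mathbf{g}$, we have

$$e^{\mathbf{A}t} \cdot \mathbf{g} = e^{\lambda t} \left( \mathbf{g} + t (\mathbf{A} - \lambda \mathbf{I}) \cdot \mathbf{g} \right), \tag{9.147}$$

which is a solution.

3. If more solutions are needed, then find all linearly independent $\mathbf{g}$ for which $(\mathbf{A} - \lambda \mathbf{I})^3 \cdot \mathbf{g} = \mathbf{0}$,
but $(\mathbf{A} - \lambda \mathbf{I})^2 \cdot \mathbf{g} \ne \mathbf{0}$. The corresponding solution is

$$e^{\mathbf{A}t} \cdot \mathbf{g} = e^{\lambda t} \left( \mathbf{g} + t (\mathbf{A} - \lambda \mathbf{I}) \cdot \mathbf{g} + \frac{t^2}{2} (\mathbf{A} - \lambda \mathbf{I})^2 \cdot \mathbf{g} \right). \tag{9.148}$$

4. Continue until $N$ linearly independent solutions have been found.

A linear combination of the $N$ linearly independent solutions

$$\mathbf{x} = \sum_{n=1}^{N} c_n \mathbf{x}_n, \tag{9.149}$$

is the general solution, where $c_1, c_2, \ldots, c_N$ are arbitrary constants.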
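
The terminating series in the steps above can be sketched numerically. The following is a minimal illustration, not part of the original notes: the helper names `matvec` and `chain_solution` are hypothetical, and the matrix and generalized eigenvector are those used in Example 9.7. It sums $e^{\lambda t} \sum_k (t^k/k!) (\mathbf{A} - \lambda \mathbf{I})^k \cdot \mathbf{g}$ and stops once a term vanishes identically.

```python
import math


def matvec(M, v):
    """Return the matrix-vector product M . v for a nested-list matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def chain_solution(A, lam, g, t, max_terms=20):
    """Evaluate e^{lam t} * sum_k t^k/k! (A - lam I)^k . g.

    The loop stops as soon as (A - lam I)^k . g is exactly zero, which is
    the finite-series condition of Eq. (9.145).
    """
    n = len(g)
    N = [[A[i][j] - (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    term = list(g)  # holds (A - lam I)^k . g, starting with k = 0
    total = [0.0] * n
    for k in range(max_terms):
        coeff = t ** k / math.factorial(k)
        total = [s + coeff * c for s, c in zip(total, term)]
        term = matvec(N, term)
        if all(c == 0.0 for c in term):
            break  # the series has terminated
    return [math.exp(lam * t) * s for s in total]


# Matrix and second generalized eigenvector of Example 9.7 (lambda = 4).
A = [[4.0, 1.0, 3.0], [0.0, 4.0, 1.0], [0.0, 0.0, 4.0]]
g2 = [0.0, -3.0, 1.0]
x_half = chain_solution(A, 4.0, g2, 0.5)
```

For this choice the series has three terms, so `x_half` should agree with $e^{4t}(t^2/2, -3 + t, 1)^T$ evaluated at $t = 0.5$.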

9.5.1.4 Alternative method 

As an alternative to the method just described, which is easily seen to be equivalent, we can
use the Jordan canonical form in a straightforward way to arrive at the solution. Recall that
the Jordan form exists for all matrices. We begin with

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}. \tag{9.150}$$

Then we use the Jordan decomposition, Eq. (8.354), $\mathbf{A} = \mathbf{S} \cdot \mathbf{J} \cdot \mathbf{S}^{-1}$, to write

$$\frac{d\mathbf{x}}{dt} = \mathbf{S} \cdot \mathbf{J} \cdot \mathbf{S}^{-1} \cdot \mathbf{x}. \tag{9.151}$$

If we apply the matrix operator $\mathbf{S}^{-1}$, which is a constant, to both sides, we get

$$\frac{d}{dt} \underbrace{\left( \mathbf{S}^{-1} \cdot \mathbf{x} \right)}_{=\mathbf{z}} = \mathbf{J} \cdot \underbrace{\mathbf{S}^{-1} \cdot \mathbf{x}}_{=\mathbf{z}}. \tag{9.152}$$

Now taking $\mathbf{z} = \mathbf{S}^{-1} \cdot \mathbf{x}$, we get

$$\frac{d\mathbf{z}}{dt} = \mathbf{J} \cdot \mathbf{z}. \tag{9.153}$$

We then solve each equation one by one, starting with the last equation $dz_N/dt = \lambda_N z_N$,
and proceeding to the first. In the process of solving these equations sequentially, there will
be feedback for each off-diagonal term which will give rise to a secular term in the solution.
Once $\mathbf{z}$ is determined, we solve for $\mathbf{x}$ by taking $\mathbf{x} = \mathbf{S} \cdot \mathbf{z}$.

It is also noted that this method works in the common case in which the matrix $\mathbf{J}$ is
diagonal; that is, it applies for cases in which there are $n$ differential equations and $n$ ordinary
eigenvectors.



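Example 9.7

For $\mathbf{x} \in \mathbb{R}^3$, $t \in \mathbb{R}^1$, $\mathbf{A} \in \mathbb{R}^3 \times \mathbb{R}^3$, find the general solution of

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}, \tag{9.154}$$

where

$$\mathbf{A} = \begin{pmatrix} 4 & 1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix}. \tag{9.155}$$

$\mathbf{A}$ has an eigenvalue $\lambda = 4$ with multiplicity three. The eigenvector is

$$\mathbf{e} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \tag{9.156}$$

which gives a solution

$$e^{4t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. \tag{9.157}$$

A generalized eigenvector is

$$\mathbf{g}_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \tag{9.158}$$

which leads to the solution

$$e^{4t} \left( \mathbf{g}_1 + t (\mathbf{A} - \lambda \mathbf{I}) \cdot \mathbf{g}_1 \right) = e^{4t} \left( \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} 0 & 1 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right), \tag{9.159}$$

$$= e^{4t} \begin{pmatrix} t \\ 1 \\ 0 \end{pmatrix}. \tag{9.160}$$

Another generalized eigenvector

$$\mathbf{g}_2 = \begin{pmatrix} 0 \\ -3 \\ 1 \end{pmatrix}, \tag{9.161}$$

gives the solution

$$e^{4t} \left( \mathbf{g}_2 + t (\mathbf{A} - \lambda \mathbf{I}) \cdot \mathbf{g}_2 + \frac{t^2}{2} (\mathbf{A} - \lambda \mathbf{I})^2 \cdot \mathbf{g}_2 \right) = e^{4t} \left( \begin{pmatrix} 0 \\ -3 \\ 1 \end{pmatrix} + t \begin{pmatrix} 0 & 1 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ -3 \\ 1 \end{pmatrix} + \frac{t^2}{2} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ -3 \\ 1 \end{pmatrix} \right), \tag{9.162}$$

$$= e^{4t} \begin{pmatrix} t^2/2 \\ -3 + t \\ 1 \end{pmatrix}. \tag{9.163}$$

The general solution is

$$\mathbf{x} = c_1 e^{4t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + c_2 e^{4t} \begin{pmatrix} t \\ 1 \\ 0 \end{pmatrix} + c_3 e^{4t} \begin{pmatrix} t^2/2 \\ -3 + t \\ 1 \end{pmatrix}, \tag{9.164}$$

where $c_1, c_2, c_3$ are arbitrary constants.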

Alternative method 

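Alternatively, we can simply use the Jordan decomposition to form the solution. When we form
the matrix $\mathbf{S}$ from the eigenvectors and generalized eigenvectors, we have

$$\mathbf{S} = \begin{pmatrix} \mathbf{e} & \mathbf{g}_1 & \mathbf{g}_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{pmatrix}. \tag{9.165}$$

We then get

$$\mathbf{S}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}, \tag{9.166}$$

$$\mathbf{J} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} = \begin{pmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix}. \tag{9.167}$$

Now with $\mathbf{z} = \mathbf{S}^{-1} \cdot \mathbf{x}$, we solve $d\mathbf{z}/dt = \mathbf{J} \cdot \mathbf{z}$,

$$\frac{d}{dt} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix} = \begin{pmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}. \tag{9.168}$$

The final equation is totally uncoupled; solving $dz_3/dt = 4 z_3$, we get

$$z_3(t) = c_3 e^{4t}. \tag{9.169}$$

Now consider the second equation,

$$\frac{dz_2}{dt} = 4 z_2 + z_3. \tag{9.170}$$

Using our solution for $z_3$, we get

$$\frac{dz_2}{dt} = 4 z_2 + c_3 e^{4t}. \tag{9.171}$$

Solving, we get

$$z_2(t) = c_2 e^{4t} + c_3 t e^{4t}. \tag{9.172}$$

Now consider the first equation,

$$\frac{dz_1}{dt} = 4 z_1 + z_2. \tag{9.173}$$

Using our solution for $z_2$, we get

$$\frac{dz_1}{dt} = 4 z_1 + c_2 e^{4t} + c_3 t e^{4t}. \tag{9.174}$$

Solving, we get

$$z_1(t) = c_1 e^{4t} + \frac{1}{2} t e^{4t} \left( 2 c_2 + t c_3 \right), \tag{9.175}$$

so we have

$$\mathbf{z}(t) = \begin{pmatrix} c_1 e^{4t} + \frac{1}{2} t e^{4t} (2 c_2 + t c_3) \\ c_2 e^{4t} + c_3 t e^{4t} \\ c_3 e^{4t} \end{pmatrix}. \tag{9.176}$$

Then for $\mathbf{x} = \mathbf{S} \cdot \mathbf{z}$, we recover

$$\mathbf{x} = c_1 e^{4t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + c_2 e^{4t} \begin{pmatrix} t \\ 1 \\ 0 \end{pmatrix} + c_3 e^{4t} \begin{pmatrix} t^2/2 \\ -3 + t \\ 1 \end{pmatrix}, \tag{9.177}$$

which is identical to our earlier result.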



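Example 9.8

Examine the linear homogeneous system $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$ in terms of an explicit finite difference
approximation and give a geometric interpretation of the combined action of the differential and
matrix operator on $\mathbf{x}$.

A first order explicit finite difference approximation to the differential equation takes the form

$$\frac{\mathbf{x}^{k+1} - \mathbf{x}^{k}}{\Delta t} = \mathbf{A} \cdot \mathbf{x}^{k}, \tag{9.178}$$

$$\mathbf{x}^{k+1} = \mathbf{x}^{k} + \Delta t \, \mathbf{A} \cdot \mathbf{x}^{k}, \tag{9.179}$$

$$= (\mathbf{I} + \Delta t \, \mathbf{A}) \cdot \mathbf{x}^{k}. \tag{9.180}$$

Let us decompose $\mathbf{A}$ into a symmetric and anti-symmetric part:

$$\mathbf{A}_s = \frac{\mathbf{A} + \mathbf{A}^T}{2}, \tag{9.181}$$

$$\mathbf{A}_a = \frac{\mathbf{A} - \mathbf{A}^T}{2}, \tag{9.182}$$

so that $\mathbf{A} = \mathbf{A}_s + \mathbf{A}_a$. Then Eq. (9.180) becomes

$$\mathbf{x}^{k+1} = (\mathbf{I} + \Delta t \, \mathbf{A}_s + \Delta t \, \mathbf{A}_a) \cdot \mathbf{x}^{k}. \tag{9.183}$$

Now since $\mathbf{A}_s$ is symmetric, it can be diagonally decomposed as

$$\mathbf{A}_s = \mathbf{Q} \cdot \mathbf{\Lambda}_s \cdot \mathbf{Q}^T, \tag{9.184}$$

where $\mathbf{Q}$ is an orthogonal matrix, which we will restrict to be a rotation matrix, and $\mathbf{\Lambda}_s$ is a diagonal
matrix with the guaranteed real eigenvalues of $\mathbf{A}_s$ on its diagonal. It can also be shown that the
anti-symmetric $\mathbf{A}_a$ has a related decomposition,

$$\mathbf{A}_a = \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H, \tag{9.185}$$

where $\mathbf{U}$ is a unitary matrix and $\mathbf{\Lambda}_a$ is a diagonal matrix with the purely imaginary eigenvalues of $\mathbf{A}_a$
on its diagonal. Substituting Eqs. (9.184, 9.185) into Eq. (9.183), we get

$$\mathbf{x}^{k+1} = \Big( \mathbf{I} + \Delta t \underbrace{\mathbf{Q} \cdot \mathbf{\Lambda}_s \cdot \mathbf{Q}^T}_{\mathbf{A}_s} + \Delta t \underbrace{\mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H}_{\mathbf{A}_a} \Big) \cdot \mathbf{x}^{k}. \tag{9.186}$$

Now since $\mathbf{Q} \cdot \mathbf{Q}^T = \mathbf{I} = \mathbf{Q} \cdot \mathbf{I} \cdot \mathbf{Q}^T$, we can operate on the first and third terms of Eq. (9.186) to get

$$\mathbf{x}^{k+1} = \left( \mathbf{Q} \cdot \mathbf{I} \cdot \mathbf{Q}^T + \Delta t \, \mathbf{Q} \cdot \mathbf{\Lambda}_s \cdot \mathbf{Q}^T + \Delta t \, \mathbf{Q} \cdot \mathbf{Q}^T \cdot \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H \cdot \mathbf{Q} \cdot \mathbf{Q}^T \right) \cdot \mathbf{x}^{k}, \tag{9.187}$$

$$\mathbf{x}^{k+1} = \mathbf{Q} \cdot \left( \mathbf{I} + \Delta t \, \mathbf{\Lambda}_s + \Delta t \, \mathbf{Q}^T \cdot \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H \cdot \mathbf{Q} \right) \cdot \mathbf{Q}^T \cdot \mathbf{x}^{k}, \tag{9.188}$$

$$\mathbf{Q}^T \cdot \mathbf{x}^{k+1} = \underbrace{\mathbf{Q}^T \cdot \mathbf{Q}}_{=\mathbf{I}} \cdot \left( \mathbf{I} + \Delta t \, \mathbf{\Lambda}_s + \Delta t \, \mathbf{Q}^T \cdot \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H \cdot \mathbf{Q} \right) \cdot \mathbf{Q}^T \cdot \mathbf{x}^{k}. \tag{9.189}$$

Now, let us define a rotated coordinate system as $\hat{\mathbf{x}} = \mathbf{Q}^T \cdot \mathbf{x}$, so that Eq. (9.189) becomes

$$\hat{\mathbf{x}}^{k+1} = \Big( \mathbf{I} + \underbrace{\Delta t \, \mathbf{\Lambda}_s}_{\text{stretching}} + \underbrace{\Delta t \, \mathbf{Q}^T \cdot \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H \cdot \mathbf{Q}}_{\text{rotation}} \Big) \cdot \hat{\mathbf{x}}^{k}. \tag{9.190}$$

This rotated coordinate system is aligned with the principal axes of deformation associated with $\mathbf{A}_s$.
We see that the new value, $\hat{\mathbf{x}}^{k+1}$, is composed of the sum of three terms: 1) the old value, due to the
action of $\mathbf{I}$, 2) a stretching along the coordinate axes by the term $\Delta t \, \mathbf{\Lambda}_s$, and 3) a rotation, normal to
the coordinate axes, by the term $\Delta t \, \mathbf{Q}^T \cdot \mathbf{U} \cdot \mathbf{\Lambda}_a \cdot \mathbf{U}^H \cdot \mathbf{Q}$. We note that since both $\mathbf{Q}$ and $\mathbf{U}$ have a
norm of unity, it is the magnitude of the eigenvalues, along with $\Delta t$, that determines the amount of
stretching and rotation that occurs. Note that although $\mathbf{\Lambda}_a$ and $\mathbf{U}$ have imaginary components, when
combined together, they yield a real result.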
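
The split of one explicit step into a stretching and a rotating contribution can be checked on a small case. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the 2 x 2 matrix, the state, and the step size are arbitrary illustrative choices rather than values from the example. It forms the symmetric and anti-symmetric parts of $\mathbf{A}$ and takes one step of Eq. (9.180).

```python
def matvec(M, v):
    """Return the matrix-vector product M . v for a nested-list matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def split_symmetric(A):
    """Return (A_s, A_a), the symmetric and anti-symmetric parts of A."""
    n = len(A)
    A_s = [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    A_a = [[0.5 * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return A_s, A_a


def explicit_euler_step(A, x, dt):
    """One step of x^{k+1} = (I + dt A) . x^k."""
    return [x_i + dt * ax_i for x_i, ax_i in zip(x, matvec(A, x))]


# Illustrative values (assumptions, not from the text).
A = [[0.0, -1.0], [1.0, -1.0]]
x_k = [1.0, 2.0]
dt = 0.1

A_s, A_a = split_symmetric(A)
stretch = [dt * v for v in matvec(A_s, x_k)]  # contribution of dt A_s
rotate = [dt * v for v in matvec(A_a, x_k)]   # contribution of dt A_a
x_next = explicit_euler_step(A, x_k, dt)
```

Because $\mathbf{A}_a$ is anti-symmetric, the displacement `rotate` is orthogonal to `x_k`, which is the sense in which that term acts as a rotation.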



9.5.1.5 Fundamental matrix 

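If $\mathbf{x}_n$, $n = 1, \ldots, N$, are linearly independent solutions of $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$, then

$$\mathbf{\Omega} = \begin{pmatrix} \vdots & \vdots & & \vdots \\ \mathbf{x}_1 & \mathbf{x}_2 & \ldots & \mathbf{x}_N \\ \vdots & \vdots & & \vdots \end{pmatrix} \tag{9.191}$$

is called a fundamental matrix. The general solution is

$$\mathbf{x} = \mathbf{\Omega} \cdot \mathbf{c}, \tag{9.192}$$

where

$$\mathbf{c} = \begin{pmatrix} c_1 \\ \vdots \\ c_N \end{pmatrix}. \tag{9.193}$$

The term $e^{\mathbf{A}t} = \mathbf{\Omega}(t) \cdot \mathbf{\Omega}^{-1}(0)$ is a fundamental matrix.

Example 9.9

Find the fundamental matrix of the previous example problem.

The fundamental matrix is

$$\mathbf{\Omega} = e^{4t} \begin{pmatrix} 1 & t & t^2/2 \\ 0 & 1 & -3 + t \\ 0 & 0 & 1 \end{pmatrix}, \tag{9.194}$$

so that

$$\mathbf{x} = \mathbf{\Omega} \cdot \mathbf{c} = e^{4t} \begin{pmatrix} 1 & t & t^2/2 \\ 0 & 1 & -3 + t \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}. \tag{9.195}$$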



9.5.2 Inhomogeneous equations 

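If $\mathbf{A}$ is a constant matrix that is diagonalizable, the system of differential equations represented by

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x} + \mathbf{f}(t), \tag{9.196}$$

can be decoupled into a set of scalar equations, each of which is in terms of a single dependent
variable. From Eq. (8.296), let $\mathbf{S}$ be such that $\mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} = \mathbf{\Lambda}$, where $\mathbf{\Lambda}$ is a diagonal matrix
of eigenvalues. Taking $\mathbf{x} = \mathbf{S} \cdot \mathbf{z}$, we get

$$\frac{d(\mathbf{S} \cdot \mathbf{z})}{dt} = \mathbf{A} \cdot \mathbf{S} \cdot \mathbf{z} + \mathbf{f}(t), \tag{9.197}$$

$$\mathbf{S} \cdot \frac{d\mathbf{z}}{dt} = \mathbf{A} \cdot \mathbf{S} \cdot \mathbf{z} + \mathbf{f}(t). \tag{9.198}$$

Applying $\mathbf{S}^{-1}$ to both sides,

$$\frac{d\mathbf{z}}{dt} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S} \cdot \mathbf{z} + \mathbf{S}^{-1} \cdot \mathbf{f}(t), \tag{9.199}$$

$$\frac{d\mathbf{z}}{dt} = \mathbf{\Lambda} \cdot \mathbf{z} + \mathbf{g}(t), \tag{9.200}$$

where $\mathbf{\Lambda} = \mathbf{S}^{-1} \cdot \mathbf{A} \cdot \mathbf{S}$ and $\mathbf{g}(t) = \mathbf{S}^{-1} \cdot \mathbf{f}(t)$. This is the decoupled form of the original
equation.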



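Example 9.10

For $\mathbf{x} \in \mathbb{R}^2$, $t \in \mathbb{R}^1$, solve

$$\frac{dx_1}{dt} = 2 x_1 + x_2 + 1, \tag{9.201}$$

$$\frac{dx_2}{dt} = x_1 + 2 x_2 + t. \tag{9.202}$$

This can be written as

$$\frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ t \end{pmatrix}. \tag{9.203}$$

We have

$$\mathbf{S} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \quad \mathbf{S}^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}, \quad \mathbf{\Lambda} = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}, \tag{9.204}$$

so that

$$\frac{d}{dt} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 - t \\ 1 + t \end{pmatrix}. \tag{9.205}$$

The solution is

$$z_1 = a e^{t} + \frac{t}{2}, \tag{9.206}$$

$$z_2 = b e^{3t} - \frac{2}{9} - \frac{t}{6}, \tag{9.207}$$

which, using $x_1 = z_1 + z_2$ and $x_2 = -z_1 + z_2$, transforms to

$$x_1 = a e^{t} + b e^{3t} - \frac{2}{9} + \frac{t}{3}, \tag{9.209}$$

$$x_2 = -a e^{t} + b e^{3t} - \frac{2}{9} - \frac{2t}{3}. \tag{9.210}$$

Example 9.11

Solve the system

$$\frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot (\mathbf{x} - \mathbf{x}_0) + \mathbf{b}, \quad \mathbf{x}(t_0) = \mathbf{x}_0. \tag{9.211}$$
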

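Such a system arises naturally when one linearizes a non-linear system of the form $d\mathbf{x}/dt = \mathbf{f}(\mathbf{x})$
about a point $\mathbf{x} = \mathbf{x}_0$. Here then, $\mathbf{A}$ is the Jacobian matrix $\mathbf{A} = \partial \mathbf{f} / \partial \mathbf{x} |_{\mathbf{x} = \mathbf{x}_0}$. Note that the system is
in equilibrium when

$$\mathbf{A} \cdot (\mathbf{x} - \mathbf{x}_0) = -\mathbf{b}, \tag{9.212}$$

or

$$\mathbf{x} = \mathbf{x}_0 - \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.213}$$

Further note that if $\mathbf{b} = \mathbf{0}$, the initial condition $\mathbf{x} = \mathbf{x}_0$ is also an equilibrium condition, and is the
unique solution to the differential equation.

First define a new dependent variable $\mathbf{z}$:

$$\mathbf{z} = \mathbf{x} - \mathbf{x}_0 + \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.214}$$

So we have

$$\mathbf{x} = \mathbf{z} + \mathbf{x}_0 - \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.215}$$

At $t = t_0$, we then get

$$\mathbf{z}(t_0) = \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.216}$$

Then substitute into the original differential equation system to get

$$\frac{d}{dt} \left( \mathbf{z} + \mathbf{x}_0 - \mathbf{A}^{-1} \cdot \mathbf{b} \right) = \mathbf{A} \cdot \left( \mathbf{z} - \mathbf{A}^{-1} \cdot \mathbf{b} \right) + \mathbf{b}, \quad \mathbf{z}(t_0) = \mathbf{A}^{-1} \cdot \mathbf{b}, \tag{9.217}$$

$$\frac{d\mathbf{z}}{dt} = \mathbf{A} \cdot \mathbf{z}, \quad \mathbf{z}(t_0) = \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.218}$$

Now assume that the Jacobian is fully diagonalizable so that we can take $\mathbf{A} = \mathbf{S} \cdot \mathbf{\Lambda} \cdot \mathbf{S}^{-1}$. Thus, we
have

$$\frac{d\mathbf{z}}{dt} = \mathbf{S} \cdot \mathbf{\Lambda} \cdot \mathbf{S}^{-1} \cdot \mathbf{z}, \quad \mathbf{z}(t_0) = \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.219}$$

Take now

$$\mathbf{w} = \mathbf{S}^{-1} \cdot \mathbf{z}, \quad \mathbf{z} = \mathbf{S} \cdot \mathbf{w}, \tag{9.220}$$

so that the differential equation becomes

$$\frac{d}{dt} (\mathbf{S} \cdot \mathbf{w}) = \mathbf{S} \cdot \mathbf{\Lambda} \cdot \mathbf{w}, \quad \mathbf{S} \cdot \mathbf{w}(t_0) = \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.221}$$

Since $\mathbf{S}$ and $\mathbf{S}^{-1}$ are constant, we can apply the operator $\mathbf{S}^{-1}$ to both sides of the differential equation
system to get

$$\mathbf{S}^{-1} \cdot \frac{d}{dt} (\mathbf{S} \cdot \mathbf{w}) = \mathbf{S}^{-1} \cdot \mathbf{S} \cdot \mathbf{\Lambda} \cdot \mathbf{w}, \quad \mathbf{S}^{-1} \cdot \mathbf{S} \cdot \mathbf{w}(t_0) = \mathbf{S}^{-1} \cdot \mathbf{A}^{-1} \cdot \mathbf{b}, \tag{9.222}$$

$$\frac{d}{dt} \left( \mathbf{S}^{-1} \cdot \mathbf{S} \cdot \mathbf{w} \right) = \mathbf{I} \cdot \mathbf{\Lambda} \cdot \mathbf{w}, \quad \mathbf{I} \cdot \mathbf{w}(t_0) = \mathbf{S}^{-1} \cdot \mathbf{A}^{-1} \cdot \mathbf{b}, \tag{9.223}$$

$$\frac{d\mathbf{w}}{dt} = \mathbf{\Lambda} \cdot \mathbf{w}, \quad \mathbf{w}(t_0) = \mathbf{S}^{-1} \cdot \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.224}$$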

(9.225) 

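This is in diagonal form and has solution

$$\mathbf{w}(t) = e^{\mathbf{\Lambda}(t - t_0)} \cdot \mathbf{S}^{-1} \cdot \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.226}$$

In terms of $\mathbf{z}$, then the solution has the form

$$\mathbf{z}(t) = \mathbf{S} \cdot e^{\mathbf{\Lambda}(t - t_0)} \cdot \mathbf{S}^{-1} \cdot \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.227}$$

Then using the definition of $\mathbf{z}$, one can write the solution in terms of the original $\mathbf{x}$ as

$$\mathbf{x}(t) = \mathbf{x}_0 + \left( \mathbf{S} \cdot e^{\mathbf{\Lambda}(t - t_0)} \cdot \mathbf{S}^{-1} - \mathbf{I} \right) \cdot \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.228}$$

Note that the time scales of evolution are entirely determined by $\mathbf{\Lambda}$; in particular the time scales of
each mode, $\tau_i$, are $\tau_i = 1/\lambda_i$, where $\lambda_i$ is an entry in $\mathbf{\Lambda}$. The constant vector $\mathbf{b}$ plays a secondary role
in determining the time scales.

Lastly, one infers from the discussion of the matrix exponential, Eq. (8.461), that $e^{\mathbf{A}(t - t_0)} =
\mathbf{S} \cdot e^{\mathbf{\Lambda}(t - t_0)} \cdot \mathbf{S}^{-1}$, so we get the final form of

$$\mathbf{x}(t) = \mathbf{x}_0 + \left( e^{\mathbf{A}(t - t_0)} - \mathbf{I} \right) \cdot \mathbf{A}^{-1} \cdot \mathbf{b}. \tag{9.229}$$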
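
The closed form of Eq. (9.228) can be spot-checked numerically. The sketch below is illustrative only: it borrows the matrix $\mathbf{A}$ and its decomposition $\mathbf{S}$, $\mathbf{\Lambda}$ from Example 9.10, while the vectors `X0` and `B_VEC`, the choice $t_0 = 0$, and the function names are assumptions made for the demonstration.

```python
import math

# A = S . Lambda . S^{-1}, with the matrices of Example 9.10.
A = [[2.0, 1.0], [1.0, 2.0]]
S = [[1.0, 1.0], [-1.0, 1.0]]
S_INV = [[0.5, -0.5], [0.5, 0.5]]
LAMBDAS = [1.0, 3.0]
A_INV = [[2.0 / 3.0, -1.0 / 3.0], [-1.0 / 3.0, 2.0 / 3.0]]

# Hypothetical data for the demonstration.
X0 = [1.0, 0.0]
B_VEC = [1.0, -1.0]


def matvec(M, v):
    """Return the matrix-vector product M . v for a nested-list matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def solution(t, x0, b, t0=0.0):
    """x(t) = x0 + (S e^{Lambda (t - t0)} S^{-1} - I) . A^{-1} . b."""
    v = matvec(A_INV, b)                      # A^{-1} . b
    w = matvec(S_INV, v)                      # S^{-1} . A^{-1} . b
    w = [math.exp(lam * (t - t0)) * w_i for lam, w_i in zip(LAMBDAS, w)]
    z = matvec(S, w)                          # S . e^{Lambda (t - t0)} . w(t0)
    return [x0_i + z_i - v_i for x0_i, z_i, v_i in zip(x0, z, v)]


def residual(t, x0, b, h=1e-5):
    """Central-difference estimate of dx/dt - (A . (x - x0) + b)."""
    xp, xm, x = solution(t + h, x0, b), solution(t - h, x0, b), solution(t, x0, b)
    rhs = [r + b_i for r, b_i in zip(matvec(A, [x_i - c for x_i, c in zip(x, x0)]), b)]
    return [(p - m) / (2 * h) - r for p, m, r in zip(xp, xm, rhs)]


x_start = solution(0.0, X0, B_VEC)
res = residual(0.3, X0, B_VEC)
```

The initial condition is recovered at $t = t_0$, and the residual should be small, limited only by the finite difference used to estimate the derivative.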



9.5.2.1 Undetermined coefficients 

This method is similar to that presented for scalar equations. 



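Example 9.12

For $\mathbf{x} \in \mathbb{R}^3$, $t \in \mathbb{R}^1$, $\mathbf{A} \in \mathbb{R}^3 \times \mathbb{R}^3$, $\mathbf{f} : \mathbb{R}^1 \to \mathbb{R}^3$, solve $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x} + \mathbf{f}(t)$ with

$$\mathbf{A} = \begin{pmatrix} 4 & 1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix}, \quad \mathbf{f} = \begin{pmatrix} 3 e^{t} \\ 0 \\ 0 \end{pmatrix}. \tag{9.230}$$

The homogeneous part of this problem has been solved before. Let the particular solution be

$$\mathbf{x}_P = \mathbf{c} e^{t}. \tag{9.231}$$

Substituting into the equation, we get

$$\mathbf{c} e^{t} = \mathbf{A} \cdot \mathbf{c} e^{t} + \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix} e^{t}. \tag{9.232}$$

We can cancel the exponential to get

$$(\mathbf{I} - \mathbf{A}) \cdot \mathbf{c} = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix}, \tag{9.233}$$

which can be solved to get

$$\mathbf{c} = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix}. \tag{9.234}$$

Therefore,

$$\mathbf{x} = \mathbf{x}_H + \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} e^{t}. \tag{9.235}$$

The method must be modified if $\mathbf{f} = \mathbf{c} e^{\lambda t}$, where $\lambda$ is an eigenvalue of $\mathbf{A}$. Then the
particular solution must be of the form $\mathbf{x}_P = (\mathbf{c}_0 + t \mathbf{c}_1 + t^2 \mathbf{c}_2 + \cdots) e^{\lambda t}$, where the series is
finite, and we take as many terms as necessary.

9.5.2.2 Variation of parameters

This follows the general procedure explained in Section 3.3.2.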

9.6 Non-linear systems 

Non-linear systems can be difficult to solve. Even for algebraic systems, general solutions do 
not exist for polynomial equations of arbitrary degree. Non-linear differential equations, both 
ordinary and partial, admit analytical solutions only in special cases. Since these equations 
are quite common in engineering applications, many techniques for approximate numerical 
and analytical solutions have been developed. Our purpose here is more restricted; it is to 
analyze the long-time stability of the solutions as a function of a system parameter. We will 
first develop some of the basic ideas of stability, and then illustrate them through examples. 

9.6.1 Definitions 

With $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{f} : \mathbb{R}^N \to \mathbb{R}^N$, consider a system of $N$ non-linear first-order ordinary
differential equations

$$\frac{dx_n}{dt} = f_n(x_1, x_2, \ldots, x_N), \quad n = 1, \ldots, N, \tag{9.236}$$

where $t$ is time, and $f_n$ is a vector field. The system is autonomous since $f_n$ is not a function
of $t$. The coordinates $x_1, x_2, \ldots, x_N$ form a phase or state space. The divergence of the vector
field, $\mathrm{div} \, f_n = \sum_{n=1}^{N} \partial f_n / \partial x_n$, indicates the change of a given volume of initial conditions
in phase space. If the divergence is zero, the volume remains constant, and the system is
said to be conservative. If the divergence is negative, the volume shrinks with time, and the
system is dissipative. The volume in a dissipative system eventually goes to zero. This final
state to which some initial set of points in phase space goes is called an attractor. Attractors
may be points, closed curves, tori, or fractals (strange). A given dynamical system may have
several attractors that co-exist. Each attractor has its own basin of attraction in $\mathbb{R}^N$; initial
conditions that lie on this basin tend to that particular attractor.

The steady state solutions $x_n = \bar{x}_n$ of Eq. (9.236) are called critical (or fixed, singular or
stationary) points. Thus, by definition

$$f_n(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_N) = 0, \quad n = 1, \ldots, N, \tag{9.237}$$

which is an algebraic, potentially transcendental, set of equations. The dynamics of the
system are analyzed by studying the stability of the critical point. For this we perturb the
system so that

$$x_n = \bar{x}_n + \tilde{x}_n, \tag{9.238}$$

where the $\sim$ denotes a perturbation. If $||\tilde{x}_n||$ is bounded for $t \to \infty$, the critical point is said
to be stable, otherwise it is unstable. As a special case, if $||\tilde{x}_n|| \to 0$ as $t \to \infty$, the critical
point is asymptotically stable.

Example 9.13

Evaluate some of the properties of non-linear systems for the degenerate case of the linear system

$$\frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{9.239}$$

This is of the form $d\mathbf{x}/dt = \mathbf{A} \cdot \mathbf{x}$. This particular alibi mapping $\mathbf{f} = \mathbf{A} \cdot \mathbf{x}$ was studied in an
earlier example in Sec. 8.4. Here $f_1 = -x_2$ and $f_2 = x_1 - x_2$ defines a vector field in phase space. Its
divergence is

$$\mathrm{div} \, \mathbf{f} = \frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} = 0 - 1 = -1, \tag{9.240}$$

so the system is dissipative; that is, a volume composed of a set of points shrinks with time. In this
case the equilibrium state, $f_i = 0$, exists at a unique point, the origin, $x_1 = \bar{x}_1 = 0$, $x_2 = \bar{x}_2 = 0$. The
eigenvalues of $\mathbf{A} = \partial f_i / \partial x_j$ are $-1/2 \pm \sqrt{3} i / 2$. Thus, $\rho(\mathbf{A}) = |-1/2 \pm \sqrt{3} i / 2| = 1$, the equilibrium is
stable, and the basin of attraction is the entire $x_1, x_2$ plane.

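Note that $\det \mathbf{A} = 1$, and thus the mapping $\mathbf{A} \cdot \mathbf{x}$ is volume- and orientation-preserving. We also
find from Eq. (7.301) that $||\mathbf{A}||_2 = \sqrt{(3 + \sqrt{5})/2} = 1.61803$, so $\mathbf{A}$ operating on $\mathbf{x}$ tends to lengthen $\mathbf{x}$.
This seems to contradict the dissipative nature of the dynamical system, which is volume-shrinking! A
way to reconcile this is to consider that the mapping of a vector $\mathbf{x}$ by the dynamical system is more
complicated. Returning to the definition of the derivative, the dynamical system can also be expressed,
using the so-called "implicit" formulation, as

$$\lim_{\Delta t \to 0} \begin{pmatrix} \dfrac{x_1^{k+1} - x_1^{k}}{\Delta t} \\ \dfrac{x_2^{k+1} - x_2^{k}}{\Delta t} \end{pmatrix} = \lim_{\Delta t \to 0} \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \end{pmatrix}. \tag{9.241}$$

Had the right-hand side been evaluated at $k$ instead of $k + 1$, the formulation would be known as
"explicit." We have selected the implicit formulation so as to maintain the proper dissipative property
of the continuous system, which for this problem would not be obtained with an explicit scheme. We
demand here that $\lim_{\Delta t \to 0} x_i^{k+1} = x_i^{k}$, $i = 1, 2$. We focus on small finite $\Delta t$, though our analysis allows for
large $\Delta t$ as well, and rearrange Eq. (9.241) to get

$$\begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \end{pmatrix} = \begin{pmatrix} 0 & -\Delta t \\ \Delta t & -\Delta t \end{pmatrix} \begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \end{pmatrix} + \begin{pmatrix} x_1^{k} \\ x_2^{k} \end{pmatrix}, \tag{9.242}$$

$$\begin{pmatrix} 1 & \Delta t \\ -\Delta t & 1 + \Delta t \end{pmatrix} \begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \end{pmatrix} = \begin{pmatrix} x_1^{k} \\ x_2^{k} \end{pmatrix}, \tag{9.243}$$

$$\begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \end{pmatrix} = \begin{pmatrix} \dfrac{1 + \Delta t}{1 + \Delta t + \Delta t^2} & \dfrac{-\Delta t}{1 + \Delta t + \Delta t^2} \\ \dfrac{\Delta t}{1 + \Delta t + \Delta t^2} & \dfrac{1}{1 + \Delta t + \Delta t^2} \end{pmatrix} \begin{pmatrix} x_1^{k} \\ x_2^{k} \end{pmatrix}. \tag{9.244}$$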

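So our dynamical system, for finite $\Delta t$, is appropriately considered as an iterated map of the form

$$\mathbf{x}^{k+1} = \mathbf{B} \cdot \mathbf{x}^{k}, \tag{9.245}$$

where

$$\mathbf{B} = \begin{pmatrix} \dfrac{1 + \Delta t}{1 + \Delta t + \Delta t^2} & \dfrac{-\Delta t}{1 + \Delta t + \Delta t^2} \\ \dfrac{\Delta t}{1 + \Delta t + \Delta t^2} & \dfrac{1}{1 + \Delta t + \Delta t^2} \end{pmatrix}. \tag{9.246}$$

The matrix $\mathbf{B}$ has

$$\det \mathbf{B} = \frac{1}{1 + \Delta t + \Delta t^2}. \tag{9.247}$$

For $\Delta t > 0$, $\det \mathbf{B} < 1$, indicating a shrinking of the volume element, consistent with $\mathrm{div} \, \mathbf{f} < 0$. The
eigenvalues of $\mathbf{B}$ are

$$\frac{1 + \frac{\Delta t}{2} \pm \frac{\sqrt{3}}{2} i \, \Delta t}{1 + \Delta t + \Delta t^2}, \tag{9.248}$$

which for small $\Delta t$ expand as

$$1 - \left( 1 \pm \sqrt{3} i \right) \frac{\Delta t}{2} + \ldots \tag{9.249}$$

More importantly, the spectral norm of $\mathbf{B}$ is the square root of the largest eigenvalue of $\mathbf{B} \cdot \mathbf{B}^T$. Detailed
calculation reveals this, and its series expansion in two limits, to be

$$||\mathbf{B}||_2 = \sqrt{ \frac{1 + \Delta t + \frac{3}{2} \Delta t^2 + \Delta t \sqrt{1 + \Delta t + \frac{5}{4} \Delta t^2}}{1 + 2 \Delta t + 3 \Delta t^2 + 2 \Delta t^3 + \Delta t^4} }, \tag{9.250}$$

$$\lim_{\Delta t \to 0} ||\mathbf{B}||_2 = 1 - \frac{\Delta t^2}{2} + \ldots, \tag{9.251}$$

$$\lim_{\Delta t \to \infty} ||\mathbf{B}||_2 = \sqrt{\frac{3 + \sqrt{5}}{2}} \, \frac{1}{\Delta t} = \frac{||\mathbf{A}||_2}{\Delta t}. \tag{9.252}$$

In both limits of $\Delta t$, we see that $||\mathbf{B}||_2 \le 1$; this can be shown to hold for all $\Delta t$. It takes on a value
of unity only for $\Delta t = 0$. Then, since $||\mathbf{B}||_2 \le 1$, $\forall \Delta t$, the action of $\mathbf{B}$ on any $\mathbf{x}$ is to diminish its
norm; thus, the system is dissipative. Now $\mathbf{B}$ has a non-zero anti-symmetric part, which is typically
associated with rotation. One could show via a variety of decompositions that the action of $\mathbf{B}$ on a
vector is to compress and rotate it.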
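
The properties claimed for $\mathbf{B}$ can be verified directly for a few step sizes. This is a minimal sketch, not part of the original example: the function names are hypothetical and the sampled values of $\Delta t$ are arbitrary. It builds $\mathbf{B}$ from Eq. (9.246), computes its determinant and spectral norm from the 2 x 2 eigenvalue formula, and evaluates the closed form of Eq. (9.250) for comparison.

```python
import math


def implicit_map(dt):
    """Return the iterated-map matrix B of Eq. (9.246) for step size dt."""
    d = 1.0 + dt + dt * dt
    return [[(1.0 + dt) / d, -dt / d], [dt / d, 1.0 / d]]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def spectral_norm2(M):
    """Square root of the largest eigenvalue of M . M^T for a 2 x 2 matrix."""
    a = M[0][0] ** 2 + M[0][1] ** 2
    b = M[0][0] * M[1][0] + M[0][1] * M[1][1]
    c = M[1][0] ** 2 + M[1][1] ** 2
    half_trace = 0.5 * (a + c)
    disc = math.sqrt(max(half_trace ** 2 - (a * c - b * b), 0.0))
    return math.sqrt(half_trace + disc)


def norm_closed_form(dt):
    """Closed form of ||B||_2 from Eq. (9.250)."""
    num = 1.0 + dt + 1.5 * dt ** 2 + dt * math.sqrt(1.0 + dt + 1.25 * dt ** 2)
    den = 1.0 + 2.0 * dt + 3.0 * dt ** 2 + 2.0 * dt ** 3 + dt ** 4
    return math.sqrt(num / den)


A = [[0.0, -1.0], [1.0, -1.0]]
norm_A = spectral_norm2(A)
samples = {dt: (det2(implicit_map(dt)), spectral_norm2(implicit_map(dt)))
           for dt in (0.01, 0.1, 1.0, 10.0)}
```

For every sampled $\Delta t > 0$ both the determinant and the spectral norm of $\mathbf{B}$ come out below unity, even though $||\mathbf{A}||_2 > 1$.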



9.6.2 Linear stability 

The linear stability of the critical point is determined by restricting the analysis to a small
neighborhood of the critical point, i.e. for small values of $||\tilde{x}_j||$. We substitute Eq. (9.238)
into Eq. (9.236), and linearize by keeping only the terms that are linear in $\tilde{x}_j$ and neglecting
all products of $\tilde{x}_j$. Thus, Eq. (9.236) takes a linearized local form

$$\frac{d\tilde{x}_n}{dt} = \sum_{j=1}^{N} A_{nj} \tilde{x}_j. \tag{9.253}$$

Another way of obtaining the same result is to expand the vector field in a Taylor series
around $x_j = \bar{x}_j$ so that

$$f_n(x_j) = f_n(\bar{x}_j) + \sum_{j=1}^{N} \left. \frac{\partial f_n}{\partial x_j} \right|_{x_j = \bar{x}_j} \tilde{x}_j + \ldots, \tag{9.254}$$

in which the higher order terms have been neglected. Thus, in Eq. (9.253)

$$A_{nj} = \left. \frac{\partial f_n}{\partial x_j} \right|_{x_j = \bar{x}_j} \tag{9.255}$$

is the Jacobian of $f_n$ evaluated at the critical point. In matrix form the linearized equation
for the perturbation $\tilde{\mathbf{x}}$ is

$$\frac{d\tilde{\mathbf{x}}}{dt} = \mathbf{A} \cdot \tilde{\mathbf{x}}. \tag{9.256}$$

The real parts of the eigenvalues of $\mathbf{A}$ determine the linear stability of the critical point
$\tilde{\mathbf{x}} = \mathbf{0}$, and the behavior of the solution near it:

• If all eigenvalues have real parts $< 0$, the critical point is asymptotically stable.

• If at least one eigenvalue has a real part $> 0$, the critical point is unstable.

• If all eigenvalues have real parts $\le 0$, and some have zero real parts, then the critical
point is stable if $\mathbf{A}$ has $k$ linearly independent eigenvectors for each eigenvalue of
multiplicity $k$. Otherwise it is unstable.

The following are some terms used in classifying critical points according to the real and
imaginary parts of the eigenvalues of $\mathbf{A}$.

Classification               Eigenvalues

Hyperbolic                   Non-zero real part
Saddle                       Some real parts negative, others positive
Stable node or sink          All real parts negative
    ordinary sink            All real parts negative, imaginary parts zero
    spiral sink              All real parts negative, imaginary parts non-zero
Unstable node or source      All real parts positive
    ordinary source          All real parts positive, imaginary parts zero
    spiral source            All real parts positive, imaginary parts non-zero
Center                       All purely imaginary and non-zero
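
For planar systems the classification above reduces to inspecting two eigenvalues. The following sketch is illustrative and not from the original notes: the function names and the tolerance `tol` are assumptions, and the label "non-hyperbolic" is used here only as a catch-all for zero real parts that are not a center. It takes the coefficients of $dx/dt = a x + b y$, $dy/dt = c x + d y$ and returns the class of the origin.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def classify(a, b, c, d, tol=1e-12):
    """Classify the critical point at the origin of a planar linear system."""
    lams = eigenvalues_2x2(a, b, c, d)
    re = [lam.real for lam in lams]
    im = [lam.imag for lam in lams]
    if all(abs(r) <= tol for r in re) and all(abs(i) > tol for i in im):
        return "center"
    if any(abs(r) <= tol for r in re):
        return "non-hyperbolic"
    if re[0] * re[1] < 0.0:
        return "saddle"
    kind = "sink" if re[0] < 0.0 else "source"
    shape = "spiral" if any(abs(i) > tol for i in im) else "ordinary"
    return shape + " " + kind


# Four simple planar systems, given as (a, b, c, d).
examples = {
    "dx/dt = x, dy/dt = y": classify(1.0, 0.0, 0.0, 1.0),
    "dx/dt = -(x + y), dy/dt = x": classify(-1.0, -1.0, 1.0, 0.0),
    "dx/dt = -y, dy/dt = x": classify(0.0, -1.0, 1.0, 0.0),
    "dx/dt = y - x, dy/dt = x": classify(-1.0, 1.0, 1.0, 0.0),
}
```

The tolerance matters only for borderline cases; a real part that is numerically tiny is treated as zero, which is a modeling choice rather than a mathematical statement.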

Figures 9.6 and 9.7 show examples of phase planes for simple systems which describe
an ordinary source node, a spiral sink node, an ordinary center node, and a saddle node.
Figure 9.8 gives a phase plane, vector field, and trajectories for a complex system with many
nodes present. Here the nodes are spiral and saddle nodes.

[Figure 9.6: Phase plane for system with ordinary source node and spiral sink node. Panels: dx/dt = x, dy/dt = y; and dx/dt = -(x + y), dy/dt = x. Axes: x, y.]

[Figure 9.7: Phase plane for systems with center node and saddle node. Panels: dx/dt = -y, dy/dt = x; and dx/dt = y - x, dy/dt = x. Axes: x, y.]

[Figure 9.8: Phase plane for system with many nodes. System: dx/dt = (y + x/4 - 1/2)(x - 2y^2 + 5/2), dy/dt = (y - x)(x - 2)(x + 2). The curves dx/dt = 0 and dy/dt = 0 are marked. Axes: x, y.]

9.6.3 Lyapunov functions

For $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{f} : \mathbb{R}^N \to \mathbb{R}^N$, consider the system of differential equations

$$\frac{dx_n}{dt} = f_n(x_1, x_2, \ldots, x_N), \quad n = 1, 2, \ldots, N, \tag{9.257}$$

with $x_n = 0$ as a critical point. If there exists a $V(x_1, x_2, \ldots, x_N) : \mathbb{R}^N \to \mathbb{R}^1$ such that

• $V > 0$ for $x_n \ne 0$,

• $V = 0$ for $x_n = 0$,

• $dV/dt < 0$ for $x_n \ne 0$, and

• $dV/dt = 0$ for $x_n = 0$,

then the equilibrium point of the differential equations, $x_i = 0$, is globally stable to all perturbations,
large or small. The function $V(x_1, x_2, \ldots, x_N)$ is called a Lyapunov³ function.

Although one cannot always find a Lyapunov function for a given system of differential
equations, we can pose a method to seek a Lyapunov function given a set of autonomous
ordinary differential equations. While the method lacks robustness, it is always straightforward
to guess a functional form for a Lyapunov function and test whether or not the
proposed function satisfies the criteria:

1. Choose a test function $V(x_1, \ldots, x_N)$. The function should be chosen to be strictly
positive for $x_n \ne 0$ and zero for $x_n = 0$.

2. Calculate

$$\frac{dV}{dt} = \frac{\partial V}{\partial x_1} \frac{dx_1}{dt} + \frac{\partial V}{\partial x_2} \frac{dx_2}{dt} + \cdots + \frac{\partial V}{\partial x_N} \frac{dx_N}{dt}, \tag{9.258}$$

$$\frac{dV}{dt} = \frac{\partial V}{\partial x_1} f_1(x_1, \ldots, x_N) + \frac{\partial V}{\partial x_2} f_2(x_1, \ldots, x_N) + \cdots + \frac{\partial V}{\partial x_N} f_N(x_1, \ldots, x_N). \tag{9.259}$$

It is this step where the differential equations actually enter into the calculation.

3. Determine for the proposed $V(x_1, \ldots, x_N)$ whether or not $dV/dt < 0$, $x_n \ne 0$; $dV/dt = 0$,
$x_n = 0$. If so, then it is a Lyapunov function. If not, there may or may not be a
Lyapunov function for the system; one can guess a new functional form and test again.



³ Alexandr Mikhailovich Lyapunov, 1857-1918, Russian mathematician.

Example 9.14

Show that $x = 0$ is globally stable, if

$$m \frac{d^2 x}{dt^2} + \beta \frac{dx}{dt} + k_1 x + k_2 x^3 = 0, \quad \text{where } m, \beta, k_1, k_2 > 0. \tag{9.260}$$

This system models the motion of a mass-spring-damper system when the spring is non-linear.
Breaking the original second order differential equation into two first order equations, we get

$$\frac{dx}{dt} = y, \tag{9.261}$$

$$\frac{dy}{dt} = -\frac{\beta}{m} y - \frac{k_1}{m} x - \frac{k_2}{m} x^3. \tag{9.262}$$

Here $x$ represents the position, and $y$ represents the velocity. Let us guess that the Lyapunov function
has the form

$$V(x, y) = a x^2 + b y^2 + c x^4, \quad \text{where } a, b, c > 0. \tag{9.263}$$

Note that $V(x, y) \ge 0$ and that $V(0, 0) = 0$. Then

$$\frac{dV}{dt} = \frac{\partial V}{\partial x} \frac{dx}{dt} + \frac{\partial V}{\partial y} \frac{dy}{dt}, \tag{9.264}$$

$$= 2 a x \frac{dx}{dt} + 4 c x^3 \frac{dx}{dt} + 2 b y \frac{dy}{dt}, \tag{9.265}$$

$$= (2 a x + 4 c x^3) y + 2 b y \left( -\frac{\beta}{m} y - \frac{k_1}{m} x - \frac{k_2}{m} x^3 \right), \tag{9.266}$$

$$= 2 \left( a - \frac{b k_1}{m} \right) x y + 2 \left( 2 c - \frac{b k_2}{m} \right) x^3 y - \frac{2 b}{m} \beta y^2. \tag{9.267}$$

If we choose $b = m/2$, $a = k_1/2$, $c = k_2/4$, then the coefficients on $xy$ and $x^3 y$ in the expression for
$dV/dt$ are identically zero, and we get

$$\frac{dV}{dt} = -\beta y^2, \tag{9.268}$$

which for $\beta > 0$ is negative for all $y \ne 0$ and zero for $y = 0$. Further, with these choices of $a, b, c$, the
Lyapunov function itself is

$$V = \frac{1}{2} k_1 x^2 + \frac{1}{4} k_2 x^4 + \frac{1}{2} m y^2 \ge 0. \tag{9.269}$$

Checking, we see

$$\frac{dV}{dt} = k_1 x \frac{dx}{dt} + k_2 x^3 \frac{dx}{dt} + m y \frac{dy}{dt}, \tag{9.270}$$

$$= k_1 x y + k_2 x^3 y + m y \left( -\frac{\beta}{m} y - \frac{k_1}{m} x - \frac{k_2}{m} x^3 \right), \tag{9.271}$$

$$= k_1 x y + k_2 x^3 y - \beta y^2 - k_1 x y - k_2 x^3 y, \tag{9.272}$$

$$= -\beta y^2 \le 0. \tag{9.273}$$

Thus, $V$ is a Lyapunov function, and $x = y = 0$ is globally stable. Actually, in this case, $V$ = (kinetic
energy + potential energy), where kinetic energy = $(1/2) m y^2$, and potential energy = $(1/2) k_1 x^2 +
(1/4) k_2 x^4$. Note that $V(x, y)$ is just an algebraic function of the system's state variables. When we
take the time derivative of $V$, we are forced to invoke our original system, which defines the differential
equations. We note for this system that precisely since $V$ is strictly positive or zero for all $x, y$, and
moreover that it is decaying for all time, that this necessarily implies that $V \to 0$, hence $x, y \to 0$.
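
The decay of $V$ along a trajectory can be observed numerically. This is a minimal sketch under stated assumptions: the parameter values, initial condition, step size, and the use of a classical fourth-order Runge-Kutta integrator are illustrative choices, not part of the example, and the function names are hypothetical.

```python
def rhs(x, y, m, beta, k1, k2):
    """Right-hand sides of Eqs. (9.261) and (9.262)."""
    return y, -(beta * y + k1 * x + k2 * x ** 3) / m


def lyapunov(x, y, m, k1, k2):
    """V of Eq. (9.269): potential plus kinetic energy."""
    return 0.5 * k1 * x ** 2 + 0.25 * k2 * x ** 4 + 0.5 * m * y ** 2


def rk4_step(x, y, dt, m, beta, k1, k2):
    """One classical Runge-Kutta step for the two-equation system."""
    a1, b1 = rhs(x, y, m, beta, k1, k2)
    a2, b2 = rhs(x + 0.5 * dt * a1, y + 0.5 * dt * b1, m, beta, k1, k2)
    a3, b3 = rhs(x + 0.5 * dt * a2, y + 0.5 * dt * b2, m, beta, k1, k2)
    a4, b4 = rhs(x + dt * a3, y + dt * b3, m, beta, k1, k2)
    x_new = x + dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
    y_new = y + dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return x_new, y_new


def simulate(x0, y0, dt, steps, m=1.0, beta=0.5, k1=1.0, k2=1.0):
    """Integrate from (x0, y0) and record V after every step."""
    x, y = x0, y0
    history = [lyapunov(x, y, m, k1, k2)]
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, m, beta, k1, k2)
        history.append(lyapunov(x, y, m, k1, k2))
    return x, y, history


# Assumed parameters: m = 1, beta = 0.5, k1 = k2 = 1, released from rest at x = 1.
x_end, y_end, V_hist = simulate(1.0, 0.0, 0.01, 4000)
```

With these values $V$ starts at 0.75 and the recorded history should never increase, apart from integration error, which is consistent with $dV/dt = -\beta y^2 \le 0$.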

9.6.4 Hamiltonian systems

Closely related to the Lyapunov function of a system is the Hamiltonian, which exists for
systems which are non-dissipative, that is those systems for which $dV/dt = 0$. In such a case
we define the Hamiltonian $H$ to be the Lyapunov function $H = V$ with $dH/dt = 0$. For
such systems, we integrate once to find that $H(x_i, y_i)$ must be a constant for all $x_i, y_i$. Such
systems are said to be conservative.

With $\mathbf{x} \in \mathbb{R}^N$, $\mathbf{y} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $\mathbf{f} : \mathbb{R}^{2N} \to \mathbb{R}^N$, $\mathbf{g} : \mathbb{R}^{2N} \to \mathbb{R}^N$, we say a system of
equations of the form

$$\frac{dx_n}{dt} = f_n(x_1, \ldots, x_N, y_1, \ldots, y_N), \quad \frac{dy_n}{dt} = g_n(x_1, \ldots, x_N, y_1, \ldots, y_N), \quad n = 1, \ldots, N, \tag{9.274}$$

is Hamiltonian if we can find a function $H(x_n, y_n) : \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}^1$ such that

$$\frac{dH}{dt} = \frac{\partial H}{\partial x_n} \frac{dx_n}{dt} + \frac{\partial H}{\partial y_n} \frac{dy_n}{dt} = 0, \tag{9.275}$$

$$\frac{dH}{dt} = \frac{\partial H}{\partial x_n} f_n(x_1, \ldots, x_N, y_1, \ldots, y_N) + \frac{\partial H}{\partial y_n} g_n(x_1, \ldots, x_N, y_1, \ldots, y_N) = 0. \tag{9.276}$$

This differential equation can at times be solved directly by the method of separation of
variables in which we assume a specific functional form for $H(x_i, y_i)$.

Alternatively, we can also determine $H$ by demanding that

$$\frac{\partial H}{\partial y_n} = \frac{dx_n}{dt}, \quad \frac{\partial H}{\partial x_n} = -\frac{dy_n}{dt}. \tag{9.277}$$

Substituting from the original differential equations, we are led to equations for $H(x_i, y_i)$

$$\frac{\partial H}{\partial y_i} = f_i(x_1, \ldots, x_N, y_1, \ldots, y_N), \quad \frac{\partial H}{\partial x_i} = -g_i(x_1, \ldots, x_N, y_1, \ldots, y_N). \tag{9.278}$$



Example 9.15

Find the Hamiltonian for a linear mass spring system:

$$m \frac{d^2 x}{dt^2} + k x = 0, \quad x(0) = x_0, \quad \left. \frac{dx}{dt} \right|_{0} = \dot{x}_0. \tag{9.279}$$

Taking $dx/dt = y$ to reduce this to a system of two first order equations, we have

$$\frac{dx}{dt} = f(x, y) = y, \quad x(0) = x_0, \tag{9.280}$$

$$\frac{dy}{dt} = g(x, y) = -\frac{k}{m} x, \quad y(0) = y_0. \tag{9.281}$$

For this system $N = 1$.

We seek $H(x, y)$ such that $dH/dt = 0$. That is

$$\frac{dH}{dt} = \frac{\partial H}{\partial x} \frac{dx}{dt} + \frac{\partial H}{\partial y} \frac{dy}{dt} = 0. \tag{9.282}$$

Substituting from the given system of differential equations we have

$$\frac{\partial H}{\partial x} y + \frac{\partial H}{\partial y} \left( -\frac{k}{m} x \right) = 0. \tag{9.283}$$

As with all partial differential equations, one has to transform to a system of ordinary equations in
order to solve. Here we will take the approach of the method of separation of variables and assume a
solution of the form

$$H(x, y) = A(x) + B(y), \tag{9.284}$$

where $A$ and $B$ are functions to be determined. With this assumption, we get

$$y \frac{dA}{dx} - \frac{k}{m} x \frac{dB}{dy} = 0. \tag{9.285}$$

Rearranging, we get

$$\frac{1}{x} \frac{dA}{dx} = \frac{k}{m y} \frac{dB}{dy}. \tag{9.286}$$

Now the term on the left is a function of $x$ only, and the term on the right is a function of $y$ only. The
only way this can be generally valid is if both terms are equal to the same constant, which we take to
be $C$. Hence,

$$\frac{1}{x} \frac{dA}{dx} = \frac{k}{m y} \frac{dB}{dy} = C, \tag{9.287}$$

from which we get two ordinary differential equations:

$$\frac{dA}{dx} = C x, \quad \frac{dB}{dy} = \frac{C m}{k} y. \tag{9.288}$$

The solution is

$$A(x) = \frac{1}{2} C x^2 + K_1, \quad B(y) = \frac{1}{2} \frac{C m}{k} y^2 + K_2. \tag{9.289}$$

A general solution is

$$H(x, y) = \frac{1}{2} C \left( x^2 + \frac{m}{k} y^2 \right) + K_1 + K_2. \tag{9.290}$$

While this general solution is perfectly valid, we can obtain a common physical interpretation by taking
$C = k$, $K_1 + K_2 = 0$. With these choices, the Hamiltonian becomes

$$H(x, y) = \frac{1}{2} k x^2 + \frac{1}{2} m y^2. \tag{9.291}$$

The first term represents the potential energy of the spring, the second term represents the kinetic
energy. Since by definition $dH/dt = 0$, this system conserves its mechanical energy. Verifying the
properties of a Hamiltonian, we see

$$\frac{dH}{dt} = \frac{\partial H}{\partial x} \frac{dx}{dt} + \frac{\partial H}{\partial y} \frac{dy}{dt}, \tag{9.292}$$

$$= k x y + m y \left( -\frac{k}{m} x \right), \tag{9.293}$$

$$= 0. \tag{9.294}$$

Since this system has $dH/dt = 0$, then $H(x, y)$ must be constant for all time, including $t = 0$, when the
initial conditions apply. So

$$H(x(t), y(t)) = H(x(0), y(0)) = \frac{1}{2} \left( k x_0^2 + m y_0^2 \right). \tag{9.295}$$

Thus, the system has the integral

$$\frac{1}{2} \left( k x^2 + m y^2 \right) = \frac{1}{2} \left( k x_0^2 + m y_0^2 \right). \tag{9.296}$$

We can take an alternate solution approach by consideration of Eq. (9.278) as applied to this
problem:

$$\frac{\partial H}{\partial y} = f = y, \quad \frac{\partial H}{\partial x} = -g = \frac{k}{m} x. \tag{9.297}$$

Integrating the first of these, we get

$$H(x, y) = \frac{1}{2} y^2 + F(x). \tag{9.298}$$

Differentiating with respect to $x$, we get

$$\frac{\partial H}{\partial x} = \frac{dF}{dx}, \tag{9.299}$$

and this must be

$$\frac{dF}{dx} = \frac{k}{m} x. \tag{9.300}$$

So

$$F(x) = \frac{k}{2m} x^2 + K. \tag{9.301}$$

Thus,

$$H(x, y) = \frac{1}{2m} \left( k x^2 + m y^2 \right) + K. \tag{9.302}$$

We can choose $K = 0$, and since $dH/dt = 0$, we have $H$ as a constant which is set by the initial
conditions, thus giving

$$\frac{1}{2m} \left( k x^2 + m y^2 \right) = \frac{1}{2m} \left( k x_0^2 + m y_0^2 \right), \tag{9.303}$$

which gives identical information as does Eq. (9.296).
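
The conservation statement can be checked against the standard closed-form motion of a linear oscillator. The sketch below is an illustration only: the closed-form solution $x = x_0 \cos \omega t + (y_0/\omega) \sin \omega t$ with $\omega = \sqrt{k/m}$ is a well-known result that is not derived in this example, and the parameter values and function names are assumptions.

```python
import math


def exact_state(t, x0, y0, m, k):
    """Closed-form position and velocity of m x'' + k x = 0."""
    w = math.sqrt(k / m)
    x = x0 * math.cos(w * t) + (y0 / w) * math.sin(w * t)
    y = -x0 * w * math.sin(w * t) + y0 * math.cos(w * t)
    return x, y


def hamiltonian(x, y, m, k):
    """H of Eq. (9.291): spring potential energy plus kinetic energy."""
    return 0.5 * k * x * x + 0.5 * m * y * y


# Assumed illustrative values.
m, k = 2.0, 3.0
x0, y0 = 1.0, -0.5

H0 = hamiltonian(x0, y0, m, k)
H_samples = [hamiltonian(*exact_state(t, x0, y0, m, k), m, k)
             for t in (0.0, 0.7, 3.1, 12.5)]
```

Every entry of `H_samples` should equal `H0` up to rounding, which is the content of Eq. (9.296).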



9.7 Differential-algebraic systems 

Many dynamic systems are better considered as differential-algebraic systems of equations
of the general form given in Eq. (9.48). There is a rich theory on such systems, which we
will not be able to fully exploit here. Instead, we shall consider briefly certain types of linear
and non-linear differential-algebraic systems.

9.7.1 Linear homogeneous

Consider the system of homogeneous differential-algebraic equations of the form

$$\mathbf{B} \cdot \frac{d\mathbf{x}}{dt} = \mathbf{A} \cdot \mathbf{x}. \tag{9.304}$$

Here $\mathbf{A}$ and $\mathbf{B}$ are constant matrices, and we take $\mathbf{B}$ to be singular; thus, it cannot be
inverted. We will assume $\mathbf{A}$ is invertible. There is an apparent equilibrium when $\mathbf{x} = \mathbf{0}$, but
the singularity of $\mathbf{B}$ gives us concern that this may not always hold. In any case, we can
assume solutions of the type $\mathbf{x} = \mathbf{e} e^{\lambda t}$ and substitute into Eq. (9.304) to get

$$\mathbf{B} \cdot \mathbf{e} \lambda e^{\lambda t} = \mathbf{A} \cdot \mathbf{e} e^{\lambda t}, \tag{9.305}$$

$$\mathbf{B} \cdot \mathbf{e} \lambda = \mathbf{A} \cdot \mathbf{e}, \tag{9.306}$$

$$(\mathbf{A} - \lambda \mathbf{B}) \cdot \mathbf{e} = \mathbf{0}. \tag{9.307}$$

Eq. (9.307) is a generalized eigenvalue problem in the second sense, as considered in Sec. 8.3.2.



I 

Example 9.16 

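Solve the linear homogeneous differential-algebraic system

$\frac{dx_1}{dt} + 2\frac{dx_2}{dt} = x_1 + x_2,$ (9.308)

$0 = 2x_1 - x_2.$ (9.309)

While this problem is simple enough to directly eliminate $x_2$ in favor of $x_1$, other problems are not that
simple, so let us illustrate the general method. In matrix form, we can say

$\begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$ (9.310)

Taking $x_1 = e_1 e^{\lambda t}$ and $x_2 = e_2 e^{\lambda t}$ gives

$\lambda \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} e^{\lambda t} = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} e^{\lambda t},$ (9.311)

$\begin{pmatrix} \lambda & 2\lambda \\ 0 & 0 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix},$ (9.312)

$\begin{pmatrix} 1 - \lambda & 1 - 2\lambda \\ 2 & -1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$ (9.313)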



The determinant of the coefficient matrix must be zero, giving 

$-(1 - \lambda) - 2(1 - 2\lambda) = 0,$ (9.314)

$-1 + \lambda - 2 + 4\lambda = 0,$ (9.315)

$\lambda = \frac{3}{5}.$ (9.316)

With this generalized eigenvalue, our generalized eigenvectors in the second sense are found via 

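$\begin{pmatrix} 1 - \frac{3}{5} & 1 - 2\left(\frac{3}{5}\right) \\ 2 & -1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$ (9.317)

$\begin{pmatrix} \frac{2}{5} & -\frac{1}{5} \\ 2 & -1 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$ (9.318)

By inspection, the non-unique solution must be of the form

$\begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = C_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix}.$ (9.319)

So the general solution is

$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} C_1 e^{3t/5} \\ 2 C_1 e^{3t/5} \end{pmatrix}.$ (9.320)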

There is only one arbitrary constant for this system. 
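
The generalized eigenvalue calculation of this example can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function name `generalized_eig_singular_2x2` is ours, and the closed form used for the determinant holds only for 2 x 2 matrices with det B = 0, for which the characteristic polynomial of A - lambda B is linear in lambda.

```python
from fractions import Fraction


def generalized_eig_singular_2x2(A, B):
    """Solve (A - lam*B) e = 0 for 2x2 integer matrices with det(B) = 0.

    det(A - lam*B) = det(A) - lam*c1 + lam**2 * det(B); with det(B) = 0
    the polynomial is linear, so there is a single finite eigenvalue.
    """
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    det_A = a11 * a22 - a12 * a21
    det_B = b11 * b22 - b12 * b21
    c1 = a11 * b22 + b11 * a22 - a12 * b21 - b12 * a21
    if det_B != 0 or c1 == 0:
        raise ValueError("sketch handles only singular B with one finite eigenvalue")
    lam = Fraction(det_A, c1)
    # The rows of M = A - lam*B are parallel; (-q, p) annihilates a row (p, q).
    M = [[a11 - lam * b11, a12 - lam * b12],
         [a21 - lam * b21, a22 - lam * b22]]
    p, q = next(row for row in M if any(row))
    e = (-q, p)
    scale = e[0] if e[0] != 0 else e[1]
    return lam, (e[0] / scale, e[1] / scale)


# Matrices of Eq. (9.310): B multiplies dx/dt, A multiplies x.
A = [[1, 1], [2, -1]]
B = [[1, 2], [0, 0]]
lam, e = generalized_eig_singular_2x2(A, B)
print("lambda =", lam, " eigenvector direction =", e)
```

Exact rational arithmetic is used so that the single eigenvalue and the eigenvector direction can be compared directly with Eqs. (9.316) and (9.319).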

A less desirable approach to differential algebraic systems is to differentiate the constraint. This 
requires care in that an initial condition must be imposed which is consistent with the original constraint. 
Applying this method to our example problem gives rise to the system 

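$\frac{dx_1}{dt} + 2\frac{dx_2}{dt} = x_1 + x_2,$ (9.321)

$2\frac{dx_1}{dt} - \frac{dx_2}{dt} = 0.$ (9.322)

In matrix form, this gives

$\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$ (9.323)

$\begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \end{pmatrix} = \begin{pmatrix} \frac{1}{5} & \frac{1}{5} \\ \frac{2}{5} & \frac{2}{5} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$ (9.324)

The eigenvalues of the coefficient matrix are $\lambda = 0$ and $\lambda = 3/5$. Whenever one finds an
eigenvalue of zero in a dynamic system, there is actually a hidden algebraic constraint within the system.
Diagonalization allows us to write the system as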



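$\begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{3} & \frac{1}{3} \\ -\frac{2}{3} & \frac{1}{3} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$ (9.325)

$\begin{pmatrix} \frac{1}{3} & \frac{1}{3} \\ -\frac{2}{3} & \frac{1}{3} \end{pmatrix} \begin{pmatrix} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \end{pmatrix} = \begin{pmatrix} \frac{3}{5} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{3} & \frac{1}{3} \\ -\frac{2}{3} & \frac{1}{3} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$ (9.326)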



Regrouping, we can say 



$\frac{d}{dt}(x_1 + x_2) = \frac{3}{5}(x_1 + x_2),$ (9.327)

$\frac{d}{dt}(-2x_1 + x_2) = 0.$ (9.328)



Solving gives 



$x_1 + x_2 = C_1 e^{3t/5},$ (9.329)

$-2x_1 + x_2 = C_2.$ (9.330)

So the problem with the differentiated constraint yields two arbitrary constants. For consistency with
the original formulation, we must take $C_2 = 0$, thus $x_2 = 2x_1$. Thus,

$x_1 = \frac{1}{3} C_1 e^{3t/5},$ (9.331)

$x_2 = \frac{2}{3} C_1 e^{3t/5}.$ (9.332)

Because C\ is arbitrary, this is fully consistent with our previous solution. 

9.7.2 Non-linear

Let us consider two simple non-linear examples for differential-algebraic equation systems. 



Example 9.17

Solve

$\frac{dx}{dt} = -y,$ (9.333)

$x^2 + y^2 = 1,$ (9.334)

$x(0) = 0.99.$ (9.335)



The system is non-linear because of the non-linear constraint. However, we can also view this
system as a Hamiltonian system for a linear oscillator. The non-linear constraint is the Hamiltonian.
We recognize that if we differentiate the non-linear constraint, the system of non-linear
differential-algebraic equations reduces to a linear system of differential equations,
$dx/dt = -y$, $dy/dt = x$, which is that of a linear oscillator.

Formulated as a differential-algebraic system, we can say

$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} = \begin{pmatrix} -y \\ x^2 + y^2 - 1 \end{pmatrix}, \qquad x(0) = 0.99.$ (9.336)



We might imagine an equilibrium to be located at $(x, y) = (\pm 1, 0)$. Certainly at such a point $dx/dt = 0$,
and the constraint is satisfied. However, at such a point, $dy/dt \neq 0$, so it is not a true equilibrium.
Linearization near $(\pm 1, 0)$ would induce another generalized eigenvalue problem in the second sense.
For the full problem, the form presented is suitable for numerical integration by many appropriate
differential-algebraic software packages. We do so and find the result plotted in Fig. 9.9. For this
system, what is seen to be a pseudo-equilibrium at $(x, y) = (\pm 1, 0)$ is realized periodically. The point
is not a formal equilibrium, since the solution does not remain there as $t \to \infty$. We also clearly see
that the trajectory in the (x, y) plane is confined to the unit circle, as required by the constraint.

Figure 9.9: Solution to the differential-algebraic system of Eq. (9.336).
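
As a rough illustration of Example 9.17, the sketch below does not use a differential-algebraic package. Instead it integrates the differentiated form $dx/dt = -y$, $dy/dt = x$ with a classical fourth-order Runge-Kutta step and lets one check how well the constraint $x^2 + y^2 = 1$ is preserved. The function names, the step size, and the choice of the branch $y(0) > 0$ are our assumptions, not part of the example.

```python
import math


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def oscillator(state):
    """Differentiated constraint form: dx/dt = -y, dy/dt = x."""
    x, y = state
    return [-y, x]


def integrate_oscillator(x0=0.99, t_end=10.0, dt=1e-3):
    # Consistent initial condition from x^2 + y^2 = 1; branch y(0) > 0 is assumed.
    y0 = math.sqrt(1.0 - x0 * x0)
    state = [x0, y0]
    for _ in range(round(t_end / dt)):
        state = rk4_step(oscillator, state, dt)
    return state


x_end, y_end = integrate_oscillator()
print("constraint residual:", x_end**2 + y_end**2 - 1.0)
```

The initial value of y is computed from the original constraint, which is the consistency requirement that a differentiated constraint imposes.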


Example 9.18 

Solve 



$\frac{dx}{dt} = y^2 + xy,$ (9.337)

$2x^2 + y^2 = 1,$ (9.338)

$x(0) = 0.$ (9.339)



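Formulated as a differential-algebraic system, we can say

$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} = \begin{pmatrix} y^2 + xy \\ 2x^2 + y^2 - 1 \end{pmatrix}, \qquad x(0) = 0.$ (9.340)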



We could linearize near the potential equilibria, located at $(x, y) = (\pm 1/\sqrt{3}, \mp 1/\sqrt{3})$, $(\pm\sqrt{1/2}, 0)$. This
would induce another generalized eigenvalue problem in the second sense. For the full problem, the
form presented is suitable for numerical integration by many appropriate differential-algebraic software
packages. We do so and find the result plotted in Fig. 9.10. For this system, a true equilibrium at
$(x, y) = (1/\sqrt{3}, -1/\sqrt{3})$ is realized. We also clearly see that the trajectory in the (x, y) plane is confined
to the ellipse, as required by the constraint.

Figure 9.10: Solution to the differential-algebraic system of Eq. (9.340).
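
A minimal sketch of how the approach to the equilibrium of Example 9.18 can be checked follows. It eliminates y through the constraint, which a general differential-algebraic solver would not require. The branch $y = -\sqrt{1 - 2x^2}$ is an assumption on our part, chosen because it is the one consistent with the realized equilibrium $(1/\sqrt{3}, -1/\sqrt{3})$; the function names and step size are likewise ours.

```python
import math


def rhs(x):
    """dx/dt = y^2 + x*y with y taken from 2x^2 + y^2 = 1 (assumed lower branch)."""
    y = -math.sqrt(1.0 - 2.0 * x * x)
    return y * y + x * y


def integrate_x(x0=0.0, t_end=10.0, dt=1e-3):
    """Integrate the reduced scalar equation with classical RK4."""
    x = x0
    for _ in range(round(t_end / dt)):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


x_final = integrate_x()
y_final = -math.sqrt(1.0 - 2.0 * x_final**2)
print("approached point:", (x_final, y_final))
```

Because y is recomputed from the constraint at every stage, the computed trajectory stays on the ellipse by construction.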






9.8 Fixed points at infinity 

Often in dynamic systems there are additional fixed points, not readily seen in finite phase 
space. These fixed points are actually at infinity, and such points can play a role in deter- 
mining the dynamics of a system as well as aiding in finding basins of attraction. Fixed 
points at infinity can be studied in a variety of ways. One method involves the so-called 
Poincaré sphere. Another method uses what is called projective space.

9.8.1 Poincaré sphere

For two-dimensional dynamic systems, a good way is to transform the doubly-infinite phase 
plane onto the surface of a sphere with radius unity. The projection will be such that points 
at infinity are mapped onto the equator of the sphere. One can then view the sphere from 
the north pole and see more clearly how the dynamics develop on the surface of the sphere. 
Example 9.19

Using the Poincaré sphere, find the global dynamics, including at infinity, for the simple system

$\frac{dx}{dt} = x,$ (9.341)

$\frac{dy}{dt} = -y.$ (9.342)

Obviously the equilibrium point is at (x, y) = (0, 0), and that point is a saddle node. Let us project
the two state variables x and y into a three-dimensional space by the mapping $\mathbb{R}^2 \to \mathbb{R}^3$:

$X = \frac{x}{\sqrt{1 + x^2 + y^2}},$ (9.343)

$Y = \frac{y}{\sqrt{1 + x^2 + y^2}},$ (9.344)

$Z = \frac{1}{\sqrt{1 + x^2 + y^2}}.$ (9.345)

We actually could alternatively analyze this system with a closely related mapping from $\mathbb{R}^2 \to \mathbb{R}^2$, but
this makes some of the analysis less geometrically transparent.
Note that

$\lim_{x \to \infty} X = 1 \quad \forall\, y < \infty,$ (9.346)

$\lim_{y \to \infty} Y = 1 \quad \forall\, x < \infty.$ (9.347)

Note further if both x and y go to infinity, say on the line y = mx, then

$\lim_{x \to \infty,\, y = mx} X = \frac{1}{\sqrt{m^2 + 1}},$ (9.348)

$\lim_{x \to \infty,\, y = mx} Y = \frac{m}{\sqrt{m^2 + 1}},$ (9.349)

$\lim_{x \to \infty,\, y = mx} X^2 + Y^2 = 1.$ (9.350)

So points at infinity are mapping onto a unit circle in (X, Y) space. Also, going into the saddle node 
at (x, y) = (0,0) along the same line gives 

$\lim_{x \to 0,\, y = mx} X = x + \ldots,$ (9.351)

$\lim_{x \to 0,\, y = mx} Y = y + \ldots.$ (9.352)

So the original and transformed space have the same essential behavior near the finite equilibrium point. 
Last, note that 



$X^2 + Y^2 + Z^2 = \frac{x^2 + y^2 + 1}{1 + x^2 + y^2} = 1.$ (9.353)

Thus, in fact, the mapping takes one onto a unit sphere in (X, Y, Z) space. The surface $X^2 + Y^2 + Z^2 = 1$
is called the Poincaré sphere. One can actually view this in the same way one does an actual map of
the surface of the Earth. Just as a Mercator projection map is a representation of the spherical surface
of the earth projected onto a flat surface (and vice versa), the original (x, y) phase space is a planar
representation of the surface of the Poincaré sphere.

4 Gerardus Mercator, 1512-1594, Flemish cartographer.

Let us find the inverse transformation. By inspection, it is seen that 

$x = \frac{X}{Z},$ (9.354)

$y = \frac{Y}{Z}.$ (9.355)
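
The mapping of Eqs. (9.343-9.345) and its inverse, Eqs. (9.354-9.355), are simple enough to check numerically. The following sketch is only an illustration; the function names `to_sphere` and `from_sphere` are ours.

```python
import math


def to_sphere(x, y):
    """Map a phase-plane point (x, y) to (X, Y, Z) on the upper unit hemisphere."""
    s = math.sqrt(1.0 + x * x + y * y)
    return x / s, y / s, 1.0 / s


def from_sphere(X, Y, Z):
    """Inverse mapping x = X/Z, y = Y/Z, valid for Z != 0 (finite points)."""
    return X / Z, Y / Z


X, Y, Z = to_sphere(3.0, -4.0)
print("on unit sphere:", X * X + Y * Y + Z * Z)
print("round trip:", from_sphere(X, Y, Z))
# A point far from the origin lands close to the equator Z = 0.
print("far point:", to_sphere(1.0e8, 0.0))
```

The inverse is singular at Z = 0, which is how the equator of the sphere comes to represent points at infinity.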

Now apply the transformation, Eqs. (9.354-9.355), to our dynamical system, Eqs. (9.341-9.342):

$\frac{d}{dt}\left(\frac{X}{Z}\right) = \frac{X}{Z},$ (9.356)

$\frac{d}{dt}\left(\frac{Y}{Z}\right) = -\frac{Y}{Z}.$ (9.357)

Expand using the quotient rule to get 

$\frac{1}{Z}\frac{dX}{dt} - \frac{X}{Z^2}\frac{dZ}{dt} = \frac{X}{Z},$ (9.358)

$\frac{1}{Z}\frac{dY}{dt} - \frac{Y}{Z^2}\frac{dZ}{dt} = -\frac{Y}{Z}.$ (9.359)

Now on the unit sphere X 2 + Y 2 + Z 2 = 1, we must have 

2XdX + 2YdY + 2ZdZ = 0, (9.360) 

so dividing by dt and solving for dZ/dt, we must have 

$\frac{dZ}{dt} = -\frac{X}{Z}\frac{dX}{dt} - \frac{Y}{Z}\frac{dY}{dt}.$ (9.361)

Using Eq. (9.361) to eliminate dZ/dt in Eqs. (9.358-9.359), our dynamical system can be written as

$\frac{1}{Z}\frac{dX}{dt} - \frac{X}{Z^2}\left(-\frac{X}{Z}\frac{dX}{dt} - \frac{Y}{Z}\frac{dY}{dt}\right) = \frac{X}{Z},$ (9.362)

$\frac{1}{Z}\frac{dY}{dt} - \frac{Y}{Z^2}\left(-\frac{X}{Z}\frac{dX}{dt} - \frac{Y}{Z}\frac{dY}{dt}\right) = -\frac{Y}{Z}.$ (9.363)

In each equation, the term in parentheses is dZ/dt. Multiply Eqs. (9.362-9.363) by $Z^3$ to get

$Z^2\frac{dX}{dt} + X\left(X\frac{dX}{dt} + Y\frac{dY}{dt}\right) = Z^2 X,$ (9.364)

$Z^2\frac{dY}{dt} + Y\left(X\frac{dX}{dt} + Y\frac{dY}{dt}\right) = -Z^2 Y.$ (9.365)

Regroup to find

$(X^2 + Z^2)\frac{dX}{dt} + XY\frac{dY}{dt} = Z^2 X,$ (9.366)

$XY\frac{dX}{dt} + (Y^2 + Z^2)\frac{dY}{dt} = -Z^2 Y.$ (9.367)

Now, eliminate Z by demanding $X^2 + Y^2 + Z^2 = 1$ to get

$(1 - Y^2)\frac{dX}{dt} + XY\frac{dY}{dt} = (1 - X^2 - Y^2)X,$ (9.368)

$XY\frac{dX}{dt} + (1 - X^2)\frac{dY}{dt} = -(1 - X^2 - Y^2)Y.$ (9.369)

Solve this quasi-linear system for dX/dt and dY/dt to get

$\frac{dX}{dt} = X - X^3 + XY^2,$ (9.370)

$\frac{dY}{dt} = -Y + Y^3 - X^2 Y.$ (9.371)

The five equilibrium points, and their stability, for this system are

(X, Y) = (0, 0), saddle, (9.372)

(X, Y) = (1, 0), sink, (9.373)

(X, Y) = (-1, 0), sink, (9.374)

(X, Y) = (0, 1), source, (9.375)

(X, Y) = (0, -1), source. (9.376)

Note that in this space, four new equilibria have appeared. As we are also confined to the Poincare 
sphere on which X 2 + Y 2 + Z 2 = 1, we can also see that each of the new equilibria has Z = 0; that is, 
the new equilibrium points lie on the equator of the Poincare sphere. Transforming back to the original 
space, we find the equilibria are at 

(x,y) = (0,0), saddle, (9.377) 

(x,y) = (oo,0), sink, (9.378) 

(x,y) = (-oo,0), sink, (9.379) 

(x,y) = (0,oo), source, (9.380) 

(x,y) = (0, — oo), source. (9.381) 
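
The classification of the five equilibria can be reproduced from the Jacobian of the transformed system $dX/dt = X - X^3 + XY^2$, $dY/dt = -Y + Y^3 - X^2 Y$. The sketch below is a hedged illustration: the function names are ours, and the trace-determinant test it uses covers only hyperbolic 2 x 2 cases, returning "degenerate" otherwise.

```python
def field(X, Y):
    """Right-hand side of the system projected onto the (X, Y) plane."""
    return X - X**3 + X * Y**2, -Y + Y**3 - X**2 * Y


def jacobian(X, Y):
    """Analytic Jacobian of `field`."""
    return ((1 - 3 * X**2 + Y**2, 2 * X * Y),
            (-2 * X * Y, -1 + 3 * Y**2 - X**2))


def classify(X, Y):
    """Classify an equilibrium from the trace and determinant of the Jacobian."""
    (a, b), (c, d) = jacobian(X, Y)
    det, tr = a * d - b * c, a + d
    if det < 0:
        return "saddle"
    if det > 0 and tr < 0:
        return "sink"
    if det > 0 and tr > 0:
        return "source"
    return "degenerate"


for point in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:
    print(point, field(*point), classify(*point))
```

At each of the five points the Jacobian happens to be diagonal, so the eigenvalues can also be read directly from its diagonal entries.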

Phase portraits showing several trajectories projected into (X, Y) and (X, Y, Z) space are shown
in Fig. 9.11. Fig. 9.11a represents the Poincaré sphere from above the north pole; Fig. 9.11b depicts
the entire Poincaré sphere. On the sphere itself there are some additional complexities due to so-called
anti-podal equilibrium points. In this example, both the north pole and the south pole are saddle
equilibria, when the entire sphere is considered. For more general problems, one must realize that
this projection induces pairs of equilibria, and that usually only one member of the pairs needs to be
considered in detail.

Additionally, one notes in the global phase portraits three interesting features for two-dimensional
phase spaces:

• except at critical points, individual trajectories never cross each other, 

• all trajectories connect one critical point to another, and 

• it formally takes an infinite amount of time to reach a critical point. 

Any trajectory can also be shown to be a so-called invariant manifold. An invariant manifold is a
set of points with the special property that if any one of them is used as an initial condition for the
dynamic system, the time-evolution due to the dynamic system restricts the system to the invariant
manifold. Certain of these manifolds are so-called slow invariant manifolds in that nearby trajectories
are attracted to them. The line Y = 0, and so y = 0, represents a slow invariant manifold for this
system. Note that a finite initial condition can only approach two fixed points at infinity. But the curve
representing points at infinity, Z = 0, is an invariant manifold. Except for trajectories that originate
at the two source points, a point at infinity must remain at infinity.

Figure 9.11: Global phase portraits of the system dx/dt = x, dy/dt = -y: a) projection
from the Poincaré sphere onto the (X, Y) plane, b) full projection onto the Poincaré sphere
in (X, Y, Z) space.




9.8.2 Projective space 

When extended to higher dimension, the Poincare sphere approach becomes lengthy. A more 
efficient approach is provided by projective space. This approach does not have the graphical 
appeal of the Poincare sphere. 



Example 9.20

Using projective space, find the global dynamics, including at infinity, for the same simple system

$\frac{dx}{dt} = x,$ (9.382)

$\frac{dy}{dt} = -y.$ (9.383)

Again, it is obvious that the equilibrium point is at (x, y) = (0, 0), and that point is a saddle
node. Let us project the two state variables x and y into a new two-dimensional space by the mapping
$\mathbb{R}^2 \to \mathbb{R}^2$:

$X = \frac{1}{x},$ (9.384)

$Y = \frac{y}{x}.$ (9.385)
Note along the line y = mx, as $x \to \infty$, we get $X \to 0$, $Y \to m$. So for $x \neq 0$, a point at infinity in
(x, y) space maps to a finite point in (X, Y) space. By inspection, the inverse mapping is

$x = \frac{1}{X},$ (9.386)

$y = \frac{Y}{X}.$ (9.387)

Under this transformation, Eqs. (9.382-9.383) become

$\frac{d}{dt}\left(\frac{1}{X}\right) = \frac{1}{X},$ (9.388)

$\frac{d}{dt}\left(\frac{Y}{X}\right) = -\frac{Y}{X}.$ (9.389)

Expanding, we find

$-\frac{1}{X^2}\frac{dX}{dt} = \frac{1}{X},$ (9.390)

$\frac{1}{X}\frac{dY}{dt} - \frac{Y}{X^2}\frac{dX}{dt} = -\frac{Y}{X}.$ (9.391)

Simplifying gives

$\frac{dX}{dt} = -X,$ (9.392)

$X\frac{dY}{dt} - Y\frac{dX}{dt} = -XY.$ (9.393)

Solving for the derivatives, the system reduces to

$\frac{dX}{dt} = -X,$ (9.394)

$\frac{dY}{dt} = -2Y.$ (9.395)


By inspection, there is a sink at (X, Y) = (0, 0). At such a point, the inverse mapping tells us $x \to \pm\infty$
depending on whether X is positive or negative, and y is indeterminate. If we approach (X, Y) = (0, 0)
along the line Y = mX, then y approaches the finite number m. This is consistent with trajectories
being swept away from the origin towards $x \to \pm\infty$ in the original phase space, indicating an attraction
at $x \to \pm\infty$. But it does not account for the trajectories emanating from $y \to \pm\infty$. This is because
the transformation selected obscured this root.

To recover it, we can consider the alternate transformation X = x/y, Y = 1/y. Doing so leads to
the system dX/dt = 2X, dY/dt = Y, which has a source at (X, Y) = (0, 0), which is consistent with
the source-like behavior in the original x, y space as $y \to \pm\infty$. This transformation, however, obscures
the sink-like behavior at $x \to \pm\infty$.

To capture both points at infinity, we can consider a non-degenerate transformation, of which there
are infinitely many. One is X = 1/(x + y), Y = (x - y)/(x + y). Doing so leads to the system
$dX/dt = -XY$, $dY/dt = 1 - Y^2$. This system has two roots, a source at (X, Y) = (0, -1) and a sink
at (X, Y) = (0, 1). The source corresponds to $y \to \pm\infty$. The sink corresponds to $x \to \pm\infty$.
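
The three transformed systems quoted in this example can be spot-checked numerically with the chain rule, $dX/dt = (\partial X/\partial x)\,dx/dt + (\partial X/\partial y)\,dy/dt$. The sketch below approximates the partial derivatives by central differences and compares the result with the stated right-hand sides. All names in it (`pushforward`, `CASES`, `max_mismatch`) and the step size h are our own illustrative choices.

```python
def original_field(x, y):
    """The system dx/dt = x, dy/dt = -y."""
    return x, -y


def pushforward(transform, x, y, h=1e-6):
    """Chain-rule image of the original field, partials by central differences."""
    fx, fy = original_field(x, y)
    plus_x, minus_x = transform(x + h, y), transform(x - h, y)
    plus_y, minus_y = transform(x, y + h), transform(x, y - h)
    return tuple(
        (px - mx) / (2 * h) * fx + (py - my) / (2 * h) * fy
        for px, mx, py, my in zip(plus_x, minus_x, plus_y, minus_y)
    )


# Each entry: (mapping (x, y) -> (X, Y), claimed field in terms of (X, Y)).
CASES = {
    "X=1/x, Y=y/x": (lambda x, y: (1 / x, y / x),
                     lambda X, Y: (-X, -2 * Y)),
    "X=x/y, Y=1/y": (lambda x, y: (x / y, 1 / y),
                     lambda X, Y: (2 * X, Y)),
    "X=1/(x+y), Y=(x-y)/(x+y)": (lambda x, y: (1 / (x + y), (x - y) / (x + y)),
                                 lambda X, Y: (-X * Y, 1 - Y * Y)),
}


def max_mismatch(x, y):
    """Largest difference between numerical and claimed fields at (x, y)."""
    worst = 0.0
    for transform, claimed in CASES.values():
        numeric = pushforward(transform, x, y)
        exact = claimed(*transform(x, y))
        worst = max(worst, *(abs(n - e) for n, e in zip(numeric, exact)))
    return worst


print("largest mismatch at (2, 0.5):", max_mismatch(2.0, 0.5))
```

The test point must avoid x = 0, y = 0, and x + y = 0, where one of the mappings is singular.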

9.9 Fractals

In the discussion on attractors in Section 9.6.1, we included geometrical shapes called fractals.
These are objects that are not smooth, but occur frequently in the dynamical systems
literature either as attractors or as boundaries of basins of attractions.

A fractal can be defined as a geometrical shape in which the parts are in some way similar
to the whole. This self-similarity may be exact, i.e. a piece of the fractal, if magnified, may
look exactly like the whole fractal. Before discussing examples we need to put forward
a working definition of dimension. Though there are many definitions in current use, we
present here the Hausdorff-Besicovitch dimension D. If $N_\epsilon$ is the number of 'boxes' of side
length $\epsilon$ needed to cover an object, then

$D = \lim_{\epsilon \to 0} \frac{\ln N_\epsilon}{\ln(1/\epsilon)}.$ (9.396)

We can check that this definition corresponds to the common geometrical shapes.

1. Point: $N_\epsilon = 1$, D = 0 since $D = \lim_{\epsilon \to 0} \frac{\ln 1}{\ln(1/\epsilon)} = 0$,

2. Line of length l: $N_\epsilon = l/\epsilon$, D = 1 since $D = \lim_{\epsilon \to 0} \frac{\ln(l/\epsilon)}{\ln(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{\ln l - \ln\epsilon}{-\ln\epsilon} = 1$,

3. Surface of size $l^2$: $N_\epsilon = (l/\epsilon)^2$, D = 2 since $D = \lim_{\epsilon \to 0} \frac{\ln(l^2/\epsilon^2)}{\ln(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{2\ln l - 2\ln\epsilon}{-\ln\epsilon} = 2$,

4. Volume of size $l^3$: $N_\epsilon = (l/\epsilon)^3$, D = 3 since $D = \lim_{\epsilon \to 0} \frac{\ln(l^3/\epsilon^3)}{\ln(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{3\ln l - 3\ln\epsilon}{-\ln\epsilon} = 3$.

A fractal has a dimension that is not an integer. Many physical objects are fractal-like, in
that they are fractal within a range of length scales. Coastlines are among the geographical
features that are of this shape. If there are $N_\epsilon$ units of a measuring stick of length $\epsilon$, the
measured length of the coastline will be of the power-law form $\epsilon N_\epsilon = \epsilon^{1-D}$, where D is the
dimension.

9.9.1 Cantor set 

Consider the line corresponding to k = 0 in Fig. 9.12. Take away the middle third to leave
the two portions; this is shown as k = 1. Repeat the process to get k = 2, 3, .... If $k \to \infty$,
what is left is called the Cantor set. Let us take the length of the line segment to be unity
when k = 0. Since $N_\epsilon = 2^k$ and $\epsilon = 1/3^k$, the dimension of the Cantor set is

$D = \lim_{\epsilon \to 0} \frac{\ln N_\epsilon}{\ln(1/\epsilon)} = \lim_{k \to \infty} \frac{\ln 2^k}{\ln 3^k} = \frac{k \ln 2}{k \ln 3} = \frac{\ln 2}{\ln 3} = 0.6309\ldots.$ (9.397)

It can be seen that the endpoints of the removed intervals are never removed; it can be
shown the Cantor set contains an infinite number of points, and it is an uncountable set. It
is totally disconnected and has a Lebesgue measure zero.

Figure 9.12: Cantor set.

5 after Felix Hausdorff, 1868-1942, German mathematician, and Abram Samoilovitch Besicovitch, 1891-1970, Russian mathematician.
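
The construction can be mimicked directly. The sketch below builds the intervals remaining after k removals with exact rational endpoints and evaluates $\ln N_\epsilon / \ln(1/\epsilon)$ at that level; the function names are ours, and only small k is practical since the number of intervals doubles at each step.

```python
import math
from fractions import Fraction


def cantor_intervals(k):
    """Closed intervals remaining after k middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # left third is kept
            refined.append((b - third, b))   # right third is kept
        intervals = refined
    return intervals


def box_dimension_estimate(k):
    """ln(N) / ln(1/eps) with N = number of intervals and eps = their length."""
    intervals = cantor_intervals(k)
    n_boxes = len(intervals)
    eps = intervals[0][1] - intervals[0][0]
    return math.log(n_boxes) / math.log(1 / eps)


level = 5
remaining = sum(b - a for a, b in cantor_intervals(level))
print("intervals:", len(cantor_intervals(level)), " total length:", remaining)
print("dimension estimate:", box_dimension_estimate(level))
```

For this set the estimate is independent of k, while the total remaining length $(2/3)^k$ shrinks toward zero, in line with the zero Lebesgue measure noted above.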

9.9.2 Koch curve 

Here we start with an equilateral triangle shown in Fig. 9.13 as k = 0. Each side of the
original triangle has unit length. The middle third of each side of the triangle is removed,
and two sides of a triangle drawn on that. This is shown as k = 1. The process is continued,
and in the limit gives a continuous, closed curve that is nowhere smooth. Since $N_\epsilon = 3 \times 4^k$
and $\epsilon = 1/3^k$, the dimension of the Koch curve is

$D = \lim_{\epsilon \to 0} \frac{\ln N_\epsilon}{\ln(1/\epsilon)} = \lim_{k \to \infty} \frac{\ln\left(3 \cdot 4^k\right)}{\ln 3^k} = \lim_{k \to \infty} \frac{\ln 3 + k \ln 4}{k \ln 3} = \frac{\ln 4}{\ln 3} = 1.261\ldots.$ (9.398)

The limit curve itself has infinite length, it is nowhere differentiable, and it surrounds a finite
area.

Figure 9.13: Koch curve.
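
Unlike the Cantor set, the finite-k estimate for the Koch curve is not constant, because of the factor of 3 in $N_\epsilon$. A small sketch, with function names of our own choosing, shows the slow approach to $\ln 4/\ln 3$ and the growth of the perimeter $3(4/3)^k$ implied by the construction.

```python
import math


def koch_segments(k):
    """Number of segments N = 3 * 4**k, each of length 1/3**k."""
    return 3 * 4**k


def koch_dimension_estimate(k):
    """ln(3 * 4**k) / ln(3**k) for k >= 1, using logs to avoid huge integers."""
    return (math.log(3) + k * math.log(4)) / (k * math.log(3))


def koch_perimeter(k):
    """Total length N * eps = 3 * (4/3)**k, which grows without bound."""
    return 3 * (4 / 3) ** k


limit = math.log(4) / math.log(3)
for k in (1, 10, 100, 1000):
    print(k, koch_dimension_estimate(k) - limit, koch_perimeter(k))
```

The difference between the estimate and the limit is exactly 1/k, so the convergence in Eq. (9.398) is only algebraic.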

9.9.3 Menger sponge 

An example of a fractal which is an iterate of an object which starts in three-dimensional
space is a "Menger sponge." A Menger sponge is depicted in Fig. 9.14.

Figure 9.14: Menger sponge.

6 Georg Ferdinand Ludwig Philipp Cantor, 1845-1918, Russian-born, German-based mathematician.
7 Niels Fabian Helge von Koch, 1870-1924, Swedish mathematician.
8 Karl Menger, 1902-1985, Austrian-born mathematician and member of the influential "Vienna Circle."
He served on the faculties of the Universities of Amsterdam, Vienna, Notre Dame, and the Illinois Institute
of Technology.

9.9.4 Weierstrass function 

For $a, b, t \in \mathbb{R}^1$, $W: \mathbb{R}^1 \to \mathbb{R}^1$, the Weierstrass function is

$W(t) = \sum_{k=1}^{\infty} a^k \cos\left(b^k t\right),$ (9.399)

where $0 < a < 1$, b is a positive odd integer, and $ab > 1 + 3\pi/2$. It is everywhere continuous, but
nowhere differentiable! Both require some effort to prove. A Weierstrass function is plotted in
Fig. 9.15. Its fractal character can be seen when one recognizes that cosine waves of ever higher
frequency are superposed onto low frequency cosine waves.

Figure 9.15: Four term (k = 1, ..., 4) approximation to the Weierstrass function, W(t) for
b = 13, a = 1/2.
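
A truncated Weierstrass sum is easy to evaluate. The sketch below uses the four-term truncation with b = 13, a = 1/2 quoted in the caption of Fig. 9.15; the function name `weierstrass` and its keyword arguments are our own.

```python
import math


def weierstrass(t, a=0.5, b=13, terms=4):
    """Partial sum of a**k * cos(b**k * t) for k = 1, ..., terms."""
    return sum(a**k * math.cos(b**k * t) for k in range(1, terms + 1))


# Sample the partial sum on a coarse grid over one period of the slowest term.
n_samples = 8
period = 2 * math.pi / 13
for i in range(n_samples + 1):
    t = period * i / n_samples
    print(f"t = {t:.4f}, W = {weierstrass(t):+.4f}")
```

Each added term halves the amplitude but multiplies the frequency by 13, so the slope of the partial sums grows without bound as more terms are kept.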



9.9.5 Mandelbrot and Julia sets 



For $z \in \mathbb{C}^1$, $c \in \mathbb{C}^1$, the Mandelbrot set is the set of all c for which

$z_{k+1} = z_k^2 + c,$ (9.400)

stays bounded as $k \to \infty$, when $z_0 = 0$. The boundaries of this set are fractal. A Mandelbrot
set is sketched in Fig. 9.16.

Figure 9.16: Mandelbrot set. Black regions stay bounded; colored regions become unbounded
with shade indicating how rapidly the system becomes unbounded. Image generated from
http://cs.clarku.edu/~djoyce/julia/explorer.html

9 Karl Theodor Wilhelm Weierstrass, 1815-1897, Westphalia-born German mathematician.
10 Benoit Mandelbrot, 1924-2010, Polish-born mathematician based mainly in France.

Associated with each c for the Mandelbrot set is a Julia set. In this case, the Julia set
is the set of complex initial seeds $z_0$ which allow $z_{k+1} = z_k^2 + c$ to converge for fixed complex
c. A Julia set for c = 0.49 + 0.57i is plotted in Fig. 9.17.

Figure 9.17: Julia set for c = 0.49 + 0.57i. Black regions stay bounded; colored regions become
unbounded with shade of color indicating how rapidly the system becomes unbounded. Image
generated from http://cs.clarku.edu/~djoyce/julia/explorer.html
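
Images such as these are typically produced by an escape-time iteration. The following is a minimal sketch, not the method of the cited web page: the function names, the iteration cap of 200, and the escape radius of 2 are our assumptions. For $|c| \le 2$, an iterate with $|z| > 2$ is guaranteed to diverge, whereas surviving the finite cap only suggests boundedness. For the Julia set, the sketch tests boundedness of the seed's orbit, the same criterion used for the Mandelbrot set.

```python
def escape_count(c, z0=0j, max_iter=200, radius=2.0):
    """Iterate z -> z*z + c; return the first index k with |z_k| > radius, else None."""
    z = z0
    for k in range(max_iter):
        if abs(z) > radius:
            return k
        z = z * z + c
    return None


def in_mandelbrot(c):
    """True if the orbit of z0 = 0 has not escaped within the iteration cap."""
    return escape_count(c) is None


def in_filled_julia(z0, c):
    """True if the orbit of the seed z0, for fixed c, has not escaped within the cap."""
    return escape_count(c, z0) is None


# Coarse character rendering of the Mandelbrot set.
for row in range(11):
    im = 1.25 - 0.25 * row
    print("".join("#" if in_mandelbrot(complex(-2.0 + 0.0625 * col, im)) else "."
                  for col in range(41)))
```

The escape index returned by `escape_count` is the quantity usually mapped to a color shade in plots of this type.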



9.10 Bifurcations 

Dynamical systems representing some physical problem frequently have parameters associated
with them. Thus, for $\mathbf{x} \in \mathbb{R}^N$, $t \in \mathbb{R}^1$, $r \in \mathbb{R}^1$, $\mathbf{f}: \mathbb{R}^N \to \mathbb{R}^N$, we can write

$\frac{dx_n}{dt} = f_n(x_1, x_2, \ldots, x_N; r) \quad (n = 1, \ldots, N),$ (9.401)

where r is a parameter. The theory can easily be extended if there is more than one parameter.

11 Gaston Maurice Julia, 1893-1978, Algerian-born French mathematician.

We would like to consider the changes in the behavior of $t \to \infty$ solutions as the real
number r, called the bifurcation parameter, is varied. The nature of the critical point may
change as the parameter r is varied; other critical points may appear or disappear, or its
stability may change. This is a bifurcation, and the r at which it happens is the bifurcation
point. The study of the solutions and bifurcations of the steady state falls under singularity
theory.

Let us look at some of the bifurcations obtained for different vector fields. Some of the
examples will be one-dimensional, i.e. $x \in \mathbb{R}^1$, $r \in \mathbb{R}^1$, $f: \mathbb{R}^1 \to \mathbb{R}^1$:

$\frac{dx}{dt} = f(x; r).$ (9.402)

Even though this can be solved exactly in most cases, we will assume that such a solution
is not available so that the techniques of analysis can be developed for more complicated
systems. For a coefficient matrix that is a scalar, the eigenvalue is the coefficient itself. The
eigenvalue will be real and will cross the imaginary axis of the complex plane through the
origin as r is changed. This is called a simple bifurcation.

9.10.1 Pitchfork bifurcation 

For $x \in \mathbb{R}^1$, $t \in \mathbb{R}^1$, $r \in \mathbb{R}^1$, $r_0 \in \mathbb{R}^1$, consider

$\frac{dx}{dt} = -x\left(x^2 - (r - r_0)\right).$ (9.403)

The critical points are $x = 0$ and $\pm\sqrt{r - r_0}$. $r = r_0$ is a bifurcation point; for $r < r_0$ there
is only one critical point, while for $r > r_0$ there are three.
Linearizing around the critical point x = 0, we get

$\frac{dx}{dt} = (r - r_0)x.$ (9.404)

This has solution

$x(t) = x(0) \exp\left((r - r_0)t\right).$ (9.405)

For $r < r_0$, the critical point is asymptotically stable; for $r > r_0$ it is unstable.

Notice that the function $V(x) = x^2$ satisfies the following conditions: $V > 0$ for $x \neq 0$,
$V = 0$ for $x = 0$, and $dV/dt = (dV/dx)(dx/dt) = -2x^2\left(x^2 - (r - r_0)\right) < 0$ for $r < r_0$. Thus,
V(x) is a Lyapunov function and x = 0 is globally stable for all perturbations, large or small,
as long as $r < r_0$.

Now let us examine the critical point $\overline{x} = \sqrt{r - r_0}$ which exists only for $r > r_0$. Putting
$x = \overline{x} + \tilde{x}$, the right side of Eq. (9.403) becomes

$f(x) = -\left(\sqrt{r - r_0} + \tilde{x}\right)\left(\left(\sqrt{r - r_0} + \tilde{x}\right)^2 - (r - r_0)\right).$ (9.406)

ICC BY-NC-THJ} 29 July 2012, Sen & Powers. 






Linearizing for small $\tilde{x}$, we get

$\frac{d\tilde{x}}{dt} = -2(r - r_0)\tilde{x}.$ (9.407)

This has solution

$\tilde{x}(t) = \tilde{x}(0) \exp\left(-2(r - r_0)t\right).$ (9.408)

For $r > r_0$, this critical point is stable. The other critical point $x = -\sqrt{r - r_0}$ is also found
to be stable for $r > r_0$. The results are summarized in the bifurcation diagram sketched in
Figure 9.18.

Figure 9.18: Sketch of a pitchfork bifurcation. Heavy lines are stable equilibria; dashed lines
are unstable equilibria.

At the bifurcation point, $r = r_0$, we have

$\frac{dx}{dt} = -x^3.$ (9.409)



This equation has a critical point at x = 0 but has no linearization. We must do a non-linear
analysis to determine the stability of the critical point. In this case it is straightforward.
Solving directly and applying an initial condition, we obtain

$x(t) = \frac{x(0)}{\sqrt{1 + 2x(0)^2 t}},$ (9.410)

$\lim_{t \to \infty} x(t) = 0.$ (9.411)

Since the system approaches the critical point as $t \to \infty$ for all values of x(0), the critical
point x = 0 is unconditionally stable.
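
The three regimes of the pitchfork bifurcation can be observed numerically. The sketch below integrates $dx/dt = -x(x^2 - \mu)$, with $\mu = r - r_0$, using a classical Runge-Kutta step; the function names, step size, and sample initial conditions are illustrative assumptions of ours.

```python
import math


def pitchfork_rhs(x, mu):
    """Right-hand side -x*(x**2 - mu), with mu = r - r0."""
    return -x * (x * x - mu)


def integrate_pitchfork(x0, mu, t_end=20.0, dt=1e-3):
    """Integrate from x(0) = x0 to t_end with classical RK4."""
    x = x0
    for _ in range(round(t_end / dt)):
        k1 = pitchfork_rhs(x, mu)
        k2 = pitchfork_rhs(x + 0.5 * dt * k1, mu)
        k3 = pitchfork_rhs(x + 0.5 * dt * k2, mu)
        k4 = pitchfork_rhs(x + dt * k3, mu)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


print("mu = -1:", integrate_pitchfork(2.0, -1.0))   # decays toward x = 0
print("mu = +1:", integrate_pitchfork(0.1, 1.0))    # approaches sqrt(mu)
# At the bifurcation point the decay is only algebraic; compare with the exact solution.
print("mu =  0:", integrate_pitchfork(1.0, 0.0, t_end=10.0),
      "exact:", 1.0 / math.sqrt(1.0 + 2.0 * 10.0))
```

At $\mu = 0$ the computed value can be compared with $x(0)/\sqrt{1 + 2x(0)^2 t}$, which decays far more slowly than the exponentials found away from the bifurcation point.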



9.10.2 Transcritical bifurcation 

For $x \in \mathbb{R}^1$, $t \in \mathbb{R}^1$, $r \in \mathbb{R}^1$, $r_0 \in \mathbb{R}^1$, consider

$\frac{dx}{dt} = -x\left(x - (r - r_0)\right).$ (9.412)



CHAPTER 9. DYNAMICAL SYSTEMS 



The critical points are $x = 0$ and $x = r - r_0$. The bifurcation occurs at $r = r_0$. Once again the
linear stability of the solutions can be determined. Near x = 0, the linearization is

$\frac{dx}{dt} = (r - r_0)x,$ (9.413)

which has solution

$x(t) = x(0) \exp\left((r - r_0)t\right).$ (9.414)

So this solution is stable for $r < r_0$. Near $x = r - r_0$, we take $\tilde{x} = x - (r - r_0)$. The resulting
linearization is

$\frac{d\tilde{x}}{dt} = -(r - r_0)\tilde{x},$ (9.415)

which has solution

$\tilde{x}(t) = \tilde{x}(0) \exp\left(-(r - r_0)t\right).$ (9.416)

So this solution is stable for $r > r_0$.

At the bifurcation point, $r = r_0$, there is no linearization, and the system becomes

$\frac{dx}{dt} = -x^2,$ (9.417)

which has solution

$x(t) = \frac{x(0)}{1 + x(0)t}.$ (9.418)

Here the asymptotic stability depends on the initial condition! For $x(0) > 0$, the critical
point at x = 0 is stable. For $x(0) < 0$, there is a blowup phenomenon at $t = -1/x(0)$. The
results are summarized in the bifurcation diagram sketched in Figure 9.19.




Figure 9.19: Sketch of a transcritical bifurcation. Heavy lines are stable equilibria; dashed 
lines are unstable equilibria. 
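
The behavior at the bifurcation point can be illustrated directly from the exact solution of $dx/dt = -x^2$. In the sketch below, the function names and the finite-difference residual check are ours; the residual merely confirms that the closed form satisfies the differential equation.

```python
def x_exact(t, x0):
    """Solution x(t) = x0 / (1 + x0*t) of dx/dt = -x**2 with x(0) = x0."""
    return x0 / (1.0 + x0 * t)


def blowup_time(x0):
    """Finite blow-up time -1/x0 for x0 < 0; None if the solution exists for all t >= 0."""
    return -1.0 / x0 if x0 < 0 else None


def residual(t, x0, h=1e-6):
    """Central-difference estimate of dx/dt + x**2, which should be near zero."""
    dxdt = (x_exact(t + h, x0) - x_exact(t - h, x0)) / (2 * h)
    return dxdt + x_exact(t, x0) ** 2


print("x0 = +1.0: x(4) =", x_exact(4.0, 1.0), " blow-up:", blowup_time(1.0))
print("x0 = -0.5: x(1.999) =", x_exact(1.999, -0.5), " blow-up:", blowup_time(-0.5))
print("residual at t = 1, x0 = 1:", residual(1.0, 1.0))
```

For a negative initial condition the magnitude of x grows without bound as t approaches the blow-up time, while any positive initial condition decays algebraically to zero.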
9.10.3 Saddle-node bifurcation

For $x \in \mathbb{R}^1$, $t \in \mathbb{R}^1$, $r \in \mathbb{R}^1$, $r_0 \in \mathbb{R}^1$, consider

$\frac{dx}{dt} = -x^2 + (r - r_0).$ (9.419)

The critical points are $x = \pm\sqrt{r - r_0}$. Taking $\tilde{x} = x \mp \sqrt{r - r_0}$ and linearizing, we obtain

$\frac{d\tilde{x}}{dt} = \mp 2\sqrt{r - r_0}\,\tilde{x},$ (9.420)

which has solution

$\tilde{x}(t) = \tilde{x}(0) \exp\left(\mp 2\sqrt{r - r_0}\,t\right).$ (9.421)

For $r > r_0$, the root $x = +\sqrt{r - r_0}$ is asymptotically stable. The root $x = -\sqrt{r - r_0}$ is
unstable.

At the point $r = r_0$, there is no linearization, and the system becomes

$\frac{dx}{dt} = -x^2,$ (9.422)

which has solution

$x(t) = \frac{x(0)}{1 + x(0)t}.$ (9.423)

Here the asymptotic stability again depends on the initial condition. For $x(0) > 0$, the critical
point at x = 0 is stable. For $x(0) < 0$, there is a blowup phenomenon at $t = -1/x(0)$. The
results are summarized in the bifurcation diagram sketched in Figure 9.20.




Figure 9.20: Sketch of saddle-node bifurcation. Heavy lines are stable equilibria; dashed 
lines are unstable equilibria. 
9.10.4 Hopf bifurcation

To give an example of complex eigenvalues, one must go to a two-dimensional vector field.

Example 9.21

With $x, y, t, r, r_0 \in \mathbb{R}^1$, take

$\frac{dx}{dt} = (r - r_0)x - y - x(x^2 + y^2),$ (9.424)

$\frac{dy}{dt} = x + (r - r_0)y - y(x^2 + y^2).$ (9.425)

The origin (0, 0) is a critical point. The linearized perturbation equations are

$\frac{d}{dt}\begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = \begin{pmatrix} r - r_0 & -1 \\ 1 & r - r_0 \end{pmatrix} \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix}.$ (9.426)

The eigenvalues $\lambda$ of the coefficient matrix are $\lambda = (r - r_0) \pm i$. For $r < r_0$, the real part is negative,
and the origin is stable. At $r = r_0$ there is a Hopf bifurcation as the eigenvalues cross the imaginary
axis of the complex plane as r is changed. For $r > r_0$, a periodic orbit in the (x, y) phase plane appears.
The linear analysis will not give the amplitude of the motion. Writing the given equation in polar
coordinates $(\rho, \theta)$ yields

$\frac{d\rho}{dt} = \rho(r - r_0) - \rho^3,$ (9.427)

$\frac{d\theta}{dt} = 1.$ (9.428)

This is a pitchfork bifurcation in the amplitude of the oscillation $\rho$.
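
The amplitude predicted by the polar form, $\rho \to \sqrt{r - r_0}$ for $r > r_0$, can be checked by integrating Eqs. (9.424-9.425) in Cartesian form. The sketch below is illustrative only; the function names, the value $\mu = r - r_0 = 0.25$, the step size, and the initial conditions are our assumptions.

```python
import math


def hopf_rhs(x, y, mu):
    """Right-hand sides of Eqs. (9.424-9.425), with mu = r - r0."""
    r2 = x * x + y * y
    return mu * x - y - x * r2, x + mu * y - y * r2


def integrate_hopf(x, y, mu, t_end=60.0, dt=0.01):
    """Integrate the planar system with classical RK4 and return the final (x, y)."""
    for _ in range(round(t_end / dt)):
        k1 = hopf_rhs(x, y, mu)
        k2 = hopf_rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], mu)
        k3 = hopf_rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], mu)
        k4 = hopf_rhs(x + dt * k3[0], y + dt * k3[1], mu)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y


# Above the bifurcation a small perturbation grows onto the periodic orbit.
x, y = integrate_hopf(0.01, 0.0, mu=0.25)
print("final radius for mu = +0.25:", math.hypot(x, y), " predicted:", math.sqrt(0.25))
# Below the bifurcation the origin attracts.
x, y = integrate_hopf(1.0, 0.0, mu=-0.25)
print("final radius for mu = -0.25:", math.hypot(x, y))
```

Only the radius is compared, since the phase along the periodic orbit depends on the initial condition.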



9.11 Lorenz equations 

For independent variable $t \in \mathbb{R}^1$, dependent variables $(x, y, z)^T \in \mathbb{R}^3$, and parameters
$\sigma, r, b \in \mathbb{R}^1$, $\sigma > 0$, $r > 0$, $b > 0$, the Lorenz equations are

$\frac{dx}{dt} = \sigma(y - x),$ (9.429)

$\frac{dy}{dt} = rx - y - xz,$ (9.430)

$\frac{dz}{dt} = -bz + xy.$ (9.431)

The bifurcation parameter will be taken to be r.

12 Eberhard Frederich Ferdinand Hopf, 1902-1983, Austrian-born, German mathematician.
13 Edward Norton Lorenz, 1917-2008, American meteorologist.
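9.11.1 Linear stability

The critical points are obtained from

$y - x = 0,$ (9.432)

$rx - y - xz = 0,$ (9.433)

$-bz + xy = 0,$ (9.434)

which gives

$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} \sqrt{b(r-1)} \\ \sqrt{b(r-1)} \\ r - 1 \end{pmatrix}, \quad \begin{pmatrix} -\sqrt{b(r-1)} \\ -\sqrt{b(r-1)} \\ r - 1 \end{pmatrix}.$ (9.435)

Note when r = 1, there is only one critical point at the origin. For more general r, a linear
stability analysis of each of the three critical points follows.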
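• $x = y = z = 0$. Small perturbations around this point give

$\frac{d}{dt}\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix} = \begin{pmatrix} -\sigma & \sigma & 0 \\ r & -1 & 0 \\ 0 & 0 & -b \end{pmatrix} \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}.$ (9.436)

The characteristic equation is

$(\lambda + b)\left(\lambda^2 + \lambda(\sigma + 1) - \sigma(r - 1)\right) = 0,$ (9.437)

from which we get the eigenvalues

$\lambda = -b, \qquad \lambda = \frac{1}{2}\left(-(1 + \sigma) \pm \sqrt{(1 + \sigma)^2 - 4\sigma(1 - r)}\right).$ (9.438)

For $0 < r < 1$, the eigenvalues are real and negative, since $(1 + \sigma)^2 > 4\sigma(1 - r)$. At
r = 1, there is a pitchfork bifurcation with one zero eigenvalue. For $r > 1$, the origin
becomes unstable.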

• $x = y = \sqrt{b(r - 1)}$, $z = r - 1$. We first note we need $r > 1$ for a real solution. Small
perturbations give

$\frac{d}{dt}\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix} = \begin{pmatrix} -\sigma & \sigma & 0 \\ 1 & -1 & -\sqrt{b(r-1)} \\ \sqrt{b(r-1)} & \sqrt{b(r-1)} & -b \end{pmatrix} \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}.$ (9.439)

The characteristic equation is

$\lambda^3 + (\sigma + b + 1)\lambda^2 + (\sigma + r)b\lambda + 2\sigma b(r - 1) = 0.$ (9.440)

This system is difficult to fully analyze. Detailed analysis reveals a critical value of
r:

$r = r_c = \frac{\sigma(\sigma + b + 3)}{\sigma - b - 1}.$ (9.441)

At $r = r_c$ the characteristic equation, Eq. (9.440), can be factored to give the eigenvalues

$\lambda = -(\sigma + b + 1), \qquad \lambda = \pm i\sqrt{\frac{2b\sigma(\sigma + 1)}{\sigma - b - 1}}.$ (9.442)

If $\sigma > b + 1$, two of the eigenvalues are purely imaginary, and this corresponds to
a Hopf bifurcation. The periodic solution which is created at this value of r can be
shown to be unstable so that the bifurcation is subcritical.

If $r = r_c$ and $\sigma < b + 1$, one can find all real eigenvalues, including at least one positive
eigenvalue, which tells us this is unstable.

We also find instability if $r > r_c$. If $r > r_c$ and $\sigma > b + 1$, we can find one negative
real eigenvalue and two complex eigenvalues with positive real parts; hence, this is
unstable. If $r > r_c$ and $\sigma < b + 1$, we can find three real eigenvalues, with at least one
positive; this is unstable.

For $1 < r < r_c$ and $\sigma < b + 1$, we find three real eigenvalues, one of which is positive;
this is unstable.

For stability, we can take

$1 < r < r_c, \quad \text{and} \quad \sigma > b + 1.$ (9.443)

In this case, we can find one negative real eigenvalue and two eigenvalues (which could
be real or complex) with negative real parts; hence, this is stable.

• $x = y = -\sqrt{b(r - 1)}$, $z = r - 1$. Analysis of this critical point is essentially identical
to that of the previous point.

For a particular case, these results are summarized in the bifurcation diagram of Fig. 9.21.
Shown here are results when σ = 10, b = 8/3. For these values, Eq. (9.441) tells us r_c = 24.74.
Note also that σ > b + 1. For real equilibria, we need r > 0. The equilibrium at the origin
is stable for r ∈ [0, 1] and unstable for r > 1; the instability is denoted by the dashed
line. At r = 1, there is a pitchfork bifurcation, and two new real equilibria are available.
These are both linearly stable for r ∈ [1, r_c]. For r ∈ [1, 1.34562], the eigenvalues are all
real and negative. For r ∈ [1.34562, r_c], two of the eigenvalues become complex, but all
three have negative real parts, so local linear stability is maintained. For r > r_c, all three
equilibria are unstable and indicated by dashed lines. As an aside, we note that because of
non-linear effects, some initial conditions in fact yield trajectories which do not relax to a
stable equilibrium for r < r_c. It can be shown, for example, that if x(0) = y(0) = z(0) = 1,
then r = 24 < r_c gives rise to a trajectory which never reaches either of the linearly stable
critical points.
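The stability ranges quoted above can be checked numerically. The following is a minimal Python sketch, assuming NumPy and the Lorenz equations in the form dx/dt = σ(y - x), dy/dt = rx - y - xz, dz/dt = -bz + xy; the function names are illustrative and are not taken from the text. It evaluates the Jacobian eigenvalues at the equilibrium x = y = \sqrt{b(r - 1)}, z = r - 1 for several values of r.

```python
import numpy as np

def lorenz_jacobian(x, y, z, sigma, r, b):
    """Jacobian of the Lorenz right-hand side evaluated at (x, y, z)."""
    return np.array([[-sigma, sigma, 0.0],
                     [r - z, -1.0, -x],
                     [y, x, -b]])

def nontrivial_eigenvalues(r, sigma=10.0, b=8.0 / 3.0):
    """Eigenvalues at x = y = sqrt(b (r - 1)), z = r - 1 (requires r > 1)."""
    xe = np.sqrt(b * (r - 1.0))
    return np.linalg.eigvals(lorenz_jacobian(xe, xe, r - 1.0, sigma, r, b))

sigma, b = 10.0, 8.0 / 3.0
r_c = sigma * (sigma + b + 3.0) / (sigma - b - 1.0)  # critical value of r
print(f"r_c = {r_c:.4f}")

for r in (1.2, 1.5, 10.0, 24.0, r_c, 28.0):
    lam = nontrivial_eigenvalues(r, sigma, b)
    has_complex_pair = bool(np.any(np.abs(lam.imag) > 1e-9))
    print(f"r = {r:8.4f}  max Re(lambda) = {lam.real.max(): .4e}  "
          f"complex pair: {has_complex_pair}")
```

The sign of the largest real part switches at r = r_c, and the eigenvalues change from all real to one real plus a complex pair between r = 1.2 and r = 1.5, consistent with the ranges stated above.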
Figure 9.21: Bifurcation diagram for Lorenz equations, with σ = 10, b = 8/3.

9.11.2 Non-linear stability: center manifold projection 

This is a procedure for obtaining the non-linear behavior near an eigenvalue with zero real 
part. As an example we will look at the Lorenz system at the bifurcation point r = 1. Recall 
when r = 1, the Lorenz equations have a single equilibrium at the origin. Linearization of 
the Lorenz equations near the equilibrium point at (0, 0, 0) gives rise to a system of the form 
dx/dt = A · x, where

A = \begin{pmatrix} -\sigma & \sigma & 0 \\ 1 & -1 & 0 \\ 0 & 0 & -b \end{pmatrix}. (9.444)



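The matrix A has eigenvalues and eigenvectors

\lambda_1 = 0, \quad \mathbf{e}_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, (9.445)

\lambda_2 = -(\sigma + 1), \quad \mathbf{e}_2 = \begin{pmatrix} -\sigma \\ 1 \\ 0 \end{pmatrix}, (9.446)

\lambda_3 = -b, \quad \mathbf{e}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. (9.447)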



The fact that λ_1 = 0 suggests that there is a local algebraic dependency between at least two
of the state variables, and that locally, the system behaves as a differential-algebraic system,
such as studied in Sec. 9.7.
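As a quick check of this eigen-structure, the sketch below (Python with NumPy; the matrix name S and the sample parameter values are choices made for this illustration) places the three eigenvectors in the columns of a matrix S and confirms that S^{-1} · A · S is diagonal with the eigenvalues on the diagonal, and that det S = 1 + σ.

```python
import numpy as np

sigma, b = 10.0, 8.0 / 3.0  # sample values chosen for illustration

# Linearization of the Lorenz equations about the origin at r = 1
A = np.array([[-sigma, sigma, 0.0],
              [1.0, -1.0, 0.0],
              [0.0, 0.0, -b]])

# Columns are the eigenvectors e1 = (1,1,0), e2 = (-sigma,1,0), e3 = (0,0,1)
S = np.array([[1.0, -sigma, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

Lambda = np.linalg.solve(S, A @ S)  # equals S^{-1} A S without forming the inverse
print(np.round(Lambda, 12))
print("det S =", np.linalg.det(S))
```

The diagonal entries should be 0, -(1 + σ), and -b, with all off-diagonal entries zero to rounding error.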
We use the eigenvectors as a basis to define new coordinates (u, v, w) where

\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & -\sigma & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix}. (9.448)

This linear transformation has a Jacobian whose determinant is J = 1 + σ; thus, for σ > -1,
it is orientation-preserving. It is volume-preserving only if σ = 0 or -2. Inversion shows
that

u = \frac{x + \sigma y}{1 + \sigma}, (9.449)

v = \frac{y - x}{1 + \sigma}, (9.450)

w = z. (9.451)

In terms of the new variables, the derivatives are expressed as

\frac{dx}{dt} = \frac{du}{dt} - \sigma \frac{dv}{dt}, (9.452)

\frac{dy}{dt} = \frac{du}{dt} + \frac{dv}{dt}, (9.453)

\frac{dz}{dt} = \frac{dw}{dt}, (9.454)



so that the original non-linear Lorenz equations (9.429-9.431) become

\frac{du}{dt} - \sigma \frac{dv}{dt} = \sigma (1 + \sigma) v, (9.455)

\frac{du}{dt} + \frac{dv}{dt} = -(1 + \sigma) v - (u - \sigma v) w, (9.456)

\frac{dw}{dt} = -b w + (u - \sigma v)(u + v). (9.457)

Solving directly for the derivatives so as to place the equations in autonomous form, we get

\frac{du}{dt} = 0 u - \frac{\sigma}{1 + \sigma} (u - \sigma v) w = \lambda_1 u + non-linear terms, (9.458)

\frac{dv}{dt} = -(1 + \sigma) v - \frac{1}{1 + \sigma} (u - \sigma v) w = \lambda_2 v + non-linear terms, (9.459)

\frac{dw}{dt} = -b w + (u - \sigma v)(u + v) = \lambda_3 w + non-linear terms. (9.460)

The objective of using the eigenvectors as basis vectors is to change the original system to 
diagonal form in the linear terms. Notice that the linear portion of the system is in diagonal 
form with the coefficients on each linear term as a distinct eigenvalue. Furthermore, the 
eigenvalues λ_2 = -(1 + σ) and λ_3 = -b are negative, ensuring that the linear behavior
v = e^{-(1+σ)t} and w = e^{-bt} takes the solution very quickly to zero in these variables.

It would appear then that we are only left with an equation in u(t) for large t. However,
if we put v = w = 0 in the right side, dv/dt and dw/dt would be zero if it were not for the
u^2 term in dw/dt, implying that the dynamics is confined to v = w = 0 only if we ignore
this term. According to the center manifold theorem it is possible to find a manifold (called
the center manifold) which is tangent to u = 0, but is not necessarily the tangent itself, to
which the dynamics is indeed confined.

We can get as good an approximation to the center manifold as we want by choosing new
variables. Expanding Eq. (9.460), which has the potential problem, we get

\frac{dw}{dt} = -b w + u^2 - (\sigma - 1) u v - \sigma v^2. (9.461)

Letting

\tilde{w} = w - \frac{u^2}{b}, (9.462)

so that -bw + u^2 = -b w̃, we can eliminate the potential problem with the derivative of w.
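In the new variables (u, v, w̃), the full Lorenz equations are written as

\frac{du}{dt} = -\frac{\sigma}{1 + \sigma} (u - \sigma v) \left( \tilde{w} + \frac{u^2}{b} \right), (9.463)

\frac{dv}{dt} = -(1 + \sigma) v - \frac{1}{1 + \sigma} (u - \sigma v) \left( \tilde{w} + \frac{u^2}{b} \right), (9.464)

\frac{d\tilde{w}}{dt} = -b \tilde{w} - (\sigma - 1) u v - \sigma v^2 + \frac{2\sigma}{b(1 + \sigma)} u (u - \sigma v) \left( \tilde{w} + \frac{u^2}{b} \right). (9.465)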

Once again, the variables v and w̃ go to zero quickly. Formally setting them to zero, we
recast Eqs. (9.463-9.465) as

\frac{du}{dt} = -\frac{\sigma}{b(1 + \sigma)} u^3, (9.466)

\frac{dv}{dt} = -\frac{1}{b(1 + \sigma)} u^3, (9.467)

\frac{d\tilde{w}}{dt} = \frac{2\sigma}{b^2 (1 + \sigma)} u^4. (9.468)

Here, dv/dt and dw̃/dt approach zero if u approaches zero. Now the equation for the evolution
of u, Eq. (9.466), suggests that this is the case. Simply integrating Eq. (9.466) and applying
an initial condition, we get

u(t) = \pm u(0) \sqrt{ \frac{b(1 + \sigma)}{b(1 + \sigma) + 2\sigma (u(0))^2 t} }, (9.469)
which is asymptotically stable as t → ∞. So to this approximation the dynamics is confined
to the v = w̃ = 0 line. The bifurcation at r = 1 is said to be supercritical. Higher order
terms can be included to obtain improved accuracy, if necessary.
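The quality of the center manifold approximation can be explored numerically. The sketch below, in Python with NumPy, integrates the full Lorenz equations at r = 1 with a hand-written fourth-order Runge-Kutta scheme and compares the projected variable u = (x + σy)/(1 + σ) with the closed-form decay u(t) = u(0) \sqrt{b(1+σ)/(b(1+σ) + 2σ u(0)^2 t)} obtained above. The initial amplitude u(0) = 0.5, the step size, and the final time are assumptions chosen for illustration only.

```python
import numpy as np

def lorenz_rhs(s, sigma, r, b):
    """Right-hand side of the Lorenz equations."""
    x, y, z = s
    return np.array([sigma * (y - x), r * x - y - x * z, -b * z + x * y])

def rk4_final_state(s0, t_end, dt, sigma, r, b):
    """Classical fixed-step RK4; returns the state at t_end."""
    s = np.array(s0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        k1 = lorenz_rhs(s, sigma, r, b)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, sigma, r, b)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, sigma, r, b)
        k4 = lorenz_rhs(s + dt * k3, sigma, r, b)
        s = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s

sigma, r, b = 1.0, 1.0, 8.0 / 3.0
u0, t_end = 0.5, 50.0

# Start with v = 0 and w - u^2/b = 0, i.e. x = y = u0 and z = u0^2 / b
x, y, z = rk4_final_state([u0, u0, u0**2 / b], t_end, 0.01, sigma, r, b)

u_num = (x + sigma * y) / (1.0 + sigma)  # projection onto the u coordinate
u_cm = u0 * np.sqrt(b * (1.0 + sigma) /
                    (b * (1.0 + sigma) + 2.0 * sigma * u0**2 * t_end))

print(f"numerical u = {u_num:.5f}, center manifold estimate = {u_cm:.5f}")
```

The two values should agree to within a few percent; the slow algebraic decay, rather than exponential decay, is the signature of the zero eigenvalue.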

We next focus attention on a particular case where the parameters were chosen to be
r = 1, σ = 1, and b = 8/3. Figure 9.22 gives the projection onto the (u, w̃) phase space of
several solution trajectories calculated in (u, v, w̃) phase space for a wide variety of initial
conditions along with the center manifold, w̃ = w - u^2/b = 0. It is seen that a given solution



Figure 9.22: Projection onto the (u, w̃) plane of solution trajectories (blue curves) and center
manifold (black curve, w - u^2/b = 0) for Lorenz equations at the bifurcation point; r = 1, σ = 1,
b = 8/3, with x = u - v, y = u + v, z = w.

trajectory indeed approaches the center manifold on its way to the equilibrium point at the 
origin. The center manifold approximates the solution trajectories well in the neighborhood 
of the origin. Far from the origin, not shown here, it is not an attracting manifold. 

We can gain more insight into the center manifold by transforming back into (x, y, z)
space. Figure 9.23 shows in that space several solution trajectories, a representation of the
surface which constitutes the center manifold, as well as a curve embedded within the center
manifold to which trajectories are further attracted. We can in fact decompose the motion
of a trajectory into the following regimes, for the parameters r = 1, σ = 1, b = 8/3.

• Very fast attraction to the two-dimensional center manifold, w̃ = 0: Because b > σ + 1
for this case, w̃ approaches zero faster than v approaches zero, via exponential decay
dictated by Eqs. (9.464-9.465). So on a time scale of 1/b, the trajectory first approaches
w̃ = 0, which means it approaches w - u^2/b = 0. Transforming back to (x, y, z) via
Figure 9.23: Solution trajectories (blue curves) and center manifold (green surface and black
curve) for Lorenz equations at the bifurcation point; r = 1, σ = 1, b = 8/3.



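Eqs. (9.449-9.451), a trajectory thus approaches the surface

z = \frac{1}{b} \left( \frac{x}{1 + \sigma} + \frac{\sigma y}{1 + \sigma} \right)^2 \bigg|_{\sigma = 1, b = 8/3} = \frac{3}{8} \left( \frac{x + y}{2} \right)^2. (9.470)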

• Fast attraction to the one-dimensional curve, v = 0: Once on the two-dimensional
manifold, the slower time scale relaxation with time constant 1/(σ + 1) to the curve
given by v = 0 occurs. When v = 0, we also have x = y, so this curve takes the
parametric form

x(s) = s, (9.471)

y(s) = s, (9.472)

z(s) = \frac{1}{b} \left( \frac{s}{1 + \sigma} + \frac{\sigma s}{1 + \sigma} \right)^2 \bigg|_{\sigma = 1, b = 8/3} = \frac{3}{8} s^2. (9.473)

• Slow attraction to the zero-dimensional equilibrium point at (0, 0, 0): This final
relaxation brings the system to rest.
For different parameters, this sequence of events is modified, as depicted in Fig. 9.24.

Figure 9.24: Solution trajectories (blue curves) and center manifold (green surface and black
curve) for Lorenz equations at the bifurcation point; a) r = 1, σ = 1, b = 100, and b) r = 1,
σ = 100, b = 8/3.

In Fig. 9.24a, we take r = 1, σ = 1, b = 100. By Eqs. (9.464-9.465), these parameters induce
an even faster relaxation to w̃ = 0; as before, this is followed by a fast relaxation to v = 0,
where x = y, and a final slow relaxation to equilibrium. One finds that the center manifold
surface w̃ = 0 has less curvature and that the trajectories, following an initial nearly vertical
descent, have sharp curvature as they relax onto the flatter center manifold, where they again
approach equilibrium at the origin.

In Fig. 9.24b, we take r = 1, σ = 100, b = 8/3. By Eqs. (9.464-9.465), these parameters
induce an initial very fast relaxation to v = 0, where x = y. This is followed by a fast
relaxation to the center manifold where w̃ = 0, and then a slow relaxation to equilibrium at
the origin.



9.11.3 Transition to chaos 

By varying the bifurcation parameter r, we can predict what is called a transition to chaos. 
We illustrate this transition for two sets of parameters for the Lorenz equations. The first will 
have trajectories which relax to a stable fixed point; the second will have so-called chaotic 
trajectories which relax to what is known as a strange attractor. 

Example 9.22 

Consider the solution to the Lorenz equations for conditions: σ = 10, r = 10, b = 8/3 with initial
conditions x(0) = y(0) = z(0) = 1.



We first note that r > 1, so we expect the origin to be unstable. Next note from Eq. (9.441) that

r_c = \frac{\sigma(\sigma + b + 3)}{\sigma - b - 1} = \frac{10(10 + \frac{8}{3} + 3)}{10 - \frac{8}{3} - 1} = \frac{470}{19} = 24.74. (9.474)

So we have 1 < r < r_c. We also have σ > b + 1. Thus, by Eq. (9.443), we expect the other equilibria to
be stable. From Eq. (9.435), the first fixed point we examine is the origin (x, y, z) = (0, 0, 0). We find
its stability by solving the characteristic equation, Eq. (9.437):

(\lambda + b)(\lambda^2 + \lambda(\sigma + 1) - \sigma(r - 1)) = 0, (9.475)

\left( \lambda + \frac{8}{3} \right) (\lambda^2 + 11\lambda - 90) = 0. (9.476)

Solution gives

\lambda = -\frac{8}{3}, \quad \lambda = \frac{1}{2} \left( -11 \pm \sqrt{481} \right). (9.477)

Numerically, this is λ = -2.67, -16.47, 5.47. Since one of the eigenvalues is positive, the origin is
unstable. From Eq. (9.435), a second fixed point is given by



x = \sqrt{b(r - 1)} = \sqrt{\frac{8}{3}(10 - 1)} = 2\sqrt{6} = 4.90, (9.478)

y = \sqrt{b(r - 1)} = \sqrt{\frac{8}{3}(10 - 1)} = 2\sqrt{6} = 4.90, (9.479)

z = r - 1 = 10 - 1 = 9. (9.480)

Consideration of the roots of Eq. (9.440) shows the second fixed point is stable:

\lambda^3 + (\sigma + b + 1)\lambda^2 + (\sigma + r) b \lambda + 2\sigma b (r - 1) = 0, (9.481)

\lambda^3 + \frac{41}{3}\lambda^2 + \frac{160}{3}\lambda + 480 = 0. (9.482)

Solution gives

\lambda = -12.48, \quad \lambda = -0.595 \pm 6.17 i. (9.483)

From Eq. (9.435), a third fixed point is given by



x = -\sqrt{b(r - 1)} = -\sqrt{\frac{8}{3}(10 - 1)} = -2\sqrt{6} = -4.90, (9.484)

y = -\sqrt{b(r - 1)} = -\sqrt{\frac{8}{3}(10 - 1)} = -2\sqrt{6} = -4.90, (9.485)

z = r - 1 = 10 - 1 = 9. (9.486)

The stability analysis for this point is essentially identical to that for the second point. The eigenvalues
are identical, λ = -12.48, -0.595 ± 6.17i; thus, the root is linearly stable. Because we have two stable
roots, we might expect some initial conditions to induce trajectories to one of the stable roots, and
other initial conditions to induce trajectories to the other.

Figure 9.25 shows the phase space trajectories in (x, y, z) space and the behavior in the time domain,
x(t), y(t), z(t). Examination of the solution reveals that for this set of initial conditions, the second
equilibrium is attained.
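The relaxation to the second equilibrium can be reproduced with a short numerical experiment. The following Python sketch, assuming NumPy and a hand-written fixed-step fourth-order Runge-Kutta integrator (the step size 0.01 and final time 30 are illustrative choices, not values from the text), integrates from x(0) = y(0) = z(0) = 1 and compares the final state with (2\sqrt{6}, 2\sqrt{6}, 9).

```python
import numpy as np

def lorenz_rhs(s, sigma, r, b):
    """Right-hand side of the Lorenz equations."""
    x, y, z = s
    return np.array([sigma * (y - x), r * x - y - x * z, -b * z + x * y])

def integrate_rk4(s0, t_end, dt, sigma, r, b):
    """Classical RK4 with a fixed step; returns every stored state as rows."""
    n_steps = int(round(t_end / dt))
    traj = np.empty((n_steps + 1, 3))
    traj[0] = s0
    for i in range(n_steps):
        s = traj[i]
        k1 = lorenz_rhs(s, sigma, r, b)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, sigma, r, b)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, sigma, r, b)
        k4 = lorenz_rhs(s + dt * k3, sigma, r, b)
        traj[i + 1] = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return traj

sigma, r, b = 10.0, 10.0, 8.0 / 3.0
traj = integrate_rk4([1.0, 1.0, 1.0], 30.0, 0.01, sigma, r, b)

final_state = traj[-1]
target = np.array([np.sqrt(b * (r - 1.0)), np.sqrt(b * (r - 1.0)), r - 1.0])
print("final state:", final_state)
print("second equilibrium:", target)
```

Plotting the columns of traj against time would give curves of the kind described for Fig. 9.25: a damped oscillation whose decay rate and frequency are set by the complex pair λ = -0.595 ± 6.17i.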

Figure 9.25: Solution to Lorenz equations, σ = 10, r = 10, b = 8/3. Initial conditions are
x(0) = y(0) = z(0) = 1.

Example 9.23 

Now consider the conditions: σ = 10, r = 28, b = 8/3. Initial conditions remain x(0) = y(0) =
z(0) = 1.

The analysis is very similar to the previous example, except that we have changed the bifurcation
parameter r. We first note that r > 1, so we expect the origin to be an unstable equilibrium. We next
note from Eq. (9.441) that

r_c = \frac{\sigma(\sigma + b + 3)}{\sigma - b - 1} = \frac{10(10 + \frac{8}{3} + 3)}{10 - \frac{8}{3} - 1} = \frac{470}{19} = 24.74, (9.487)

remains unchanged from the previous example. So we have r > r_c. Thus, we expect the other equilibria
to be unstable as well.

From Eq. (9.435), the origin is again a fixed point, and again it can be shown to be unstable. From
Eq. (9.435), the second fixed point is now given by

x = \sqrt{b(r - 1)} = \sqrt{\frac{8}{3}(28 - 1)} = 8.485, (9.488)

y = \sqrt{b(r - 1)} = \sqrt{\frac{8}{3}(28 - 1)} = 8.485, (9.489)

z = r - 1 = 28 - 1 = 27. (9.490)

Now, consideration of the roots of the characteristic equation, Eq. (9.440), shows the second fixed point
here is unstable:

\lambda^3 + (\sigma + b + 1)\lambda^2 + (\sigma + r) b \lambda + 2\sigma b (r - 1) = 0, (9.491)

\lambda^3 + \frac{41}{3}\lambda^2 + \frac{304}{3}\lambda + 1440 = 0. (9.492)

Solution gives

\lambda = -13.8546, \quad \lambda = 0.094 \pm 10.2 i. (9.493)



Moreover, the third fixed point is unstable in exactly the same fashion as the second. The consequence
of this is that there is no possibility of achieving an equilibrium as t → ∞. More importantly, numerical
solution reveals the solution to approach what is known as a strange attractor. Moreover, numerical
experimentation would reveal an extreme, exponential sensitivity of the solution trajectories to the
initial conditions. That is, a small change in initial conditions would induce a large deviation of a
trajectory in a finite time. Such systems are known as chaotic.

Figure 9.26 shows the phase space trajectory, the strange attractor, and the behavior in the time
domain of this system, which has undergone a transition to a chaotic state.
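The sensitivity to initial conditions can be illustrated with a small numerical experiment. The Python sketch below, assuming NumPy and a hand-written fixed-step fourth-order Runge-Kutta integrator, follows two trajectories for σ = 10, r = 28, b = 8/3 whose initial x values differ by 10^{-6}; the perturbation size, step size, and final time are assumptions chosen for illustration.

```python
import numpy as np

def lorenz_rhs(s, sigma, r, b):
    """Right-hand side of the Lorenz equations."""
    x, y, z = s
    return np.array([sigma * (y - x), r * x - y - x * z, -b * z + x * y])

def integrate_rk4(s0, t_end, dt, sigma, r, b):
    """Classical RK4 with a fixed step; returns every stored state as rows."""
    n_steps = int(round(t_end / dt))
    traj = np.empty((n_steps + 1, 3))
    traj[0] = s0
    for i in range(n_steps):
        s = traj[i]
        k1 = lorenz_rhs(s, sigma, r, b)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, sigma, r, b)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, sigma, r, b)
        k4 = lorenz_rhs(s + dt * k3, sigma, r, b)
        traj[i + 1] = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return traj

sigma, r, b = 10.0, 28.0, 8.0 / 3.0
traj_a = integrate_rk4([1.0, 1.0, 1.0], 40.0, 0.01, sigma, r, b)
traj_b = integrate_rk4([1.0 + 1e-6, 1.0, 1.0], 40.0, 0.01, sigma, r, b)

# Euclidean distance between the two trajectories at each stored time
separation = np.linalg.norm(traj_a - traj_b, axis=1)
print("initial separation:", separation[0])
print("largest separation on [0, 40]:", separation.max())
```

The separation grows by many orders of magnitude while both trajectories remain bounded, which is the behavior described above as chaotic. Because of this sensitivity, the individual trajectory values depend on the integrator and step size; only the qualitative growth should be read from such a run.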

Figure 9.26: Phase space trajectory and time domain plots for solution to Lorenz equations,
σ = 10, r = 28, b = 8/3. Initial conditions are x(0) = y(0) = z(0) = 1.
Problems

1. For the logistic equation: x_{k+1} = r x_k (1 - x_k); 0 < x_k < 1, 0 < r < 4, write a short program which
determines the value of x as k → ∞. Plot the bifurcation diagram, that is the limiting value of x as
a function of r for 0 < r < 4. If r_i is the i-th bifurcation point, that is the value at which the number
of fixed points changes, make an estimate of Feigenbaum's constant,

\delta = \lim_{n \to \infty} \frac{r_{n-1} - r_n}{r_n - r_{n+1}}.

2. If

dx dy 

x Tt +xv Tt = s - 1 ' 

dx dy 



write the system in autonomous form,

\frac{dx}{dt} = f(x, y), \quad \frac{dy}{dt} = g(x, y).

Plot curves on which f = 0, g = 0 in the x, y phase plane. Also plot in this plane the vector field
defined by the differential equations. With a combination of analysis and numerics, find a path in
phase space from one critical point to another critical point. For this path, also known as a heteroclinic
orbit,¹ plot x(t), y(t) and include the path in the (x, y) phase plane.

3. Show that for all initial conditions the solutions of

\frac{dx}{dt} = -x + x^2 y - y^2,

\frac{dy}{dt} = -x^3 + x y - 6z,

\frac{dz}{dt} = 2y,

tend to x = y = z = 0 as t → ∞.

4. Draw the bifurcation diagram of



\frac{dx}{dt} = x^3 + x \left( (r - 3)^2 - 1 \right),

where r is the bifurcation parameter, indicating stable and unstable branches.
5. A two-dimensional dynamical system expressed in polar form is

\frac{dr}{dt} = r(r - 2)(r - 3),

\frac{d\theta}{dt} = 2.

Find the (a) critical point(s), (b) periodic solution(s), and (c) analyze their stability.



1 In contrast, a homoclinic orbit starts near a critical point, travels away from the critical point, and then 
returns to the same critical point. 
6. Find a critical point of the following system, and show its local and global stability.

\frac{dx}{dt} = (x - 2) \left( (y - 1)^2 - 1 \right),

\frac{dy}{dt} = (2 - y) \left( (x - 2)^2 + 1 \right),

\frac{dz}{dt} = (4 - z).

7. Find the general solution of dx/dt = A · x where



1 


-3 


1 


2 


-1 


-2 


2 


-3 






8. Find the solution of dx/dt = A · x where



and

x(0)

9. Find the solution of dx/dt = A · x where




and

x(0)

10. Find the solution of dx/dt = A · x where





and 



11. Express

dx\ dx2 

~3T + ^l + —tt + 3a; 2 = 0, 
dt dt 

dxi dx? 

in the form dx/dt = A · x and solve. Plot some solution trajectories in the (x_1, x_2) phase plane as
well as the vector field defined by the system of equations.
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12. Classify the critical points of 



dx 

~db 

'111 
dt 



y -3, 



y-x 2 + 1, 



and analyze their stability. Plot the global (x, y) phase plane including critical points and vector 
fields. 

13. The following equations arise in a natural circulation loop problem

\frac{dx}{dt} = y - x,

\frac{dy}{dt} = a - z x,

\frac{dz}{dt} = x y - b,

where a and b are nonnegative parameters. Find the critical points and analyze their linear stability.
Find numerically the attractors for (i) a = 2, b = 1, (ii) a = 0.95, b = 1, and (iii) a = 0, b = 1.

14. Sketch the steady state bifurcation diagrams of the following equations. Determine and indicate the 
linear stability of each branch. 

dx ( 1 \ . 

A = -{--r)(2x-r), 

f = - x ( {x -2f-(r-l)). 

15. The motion of a freely spinning object in space is given by

\frac{dx}{dt} = y z,

\frac{dy}{dt} = -2 x z,

\frac{dz}{dt} = x y,

where x, y, z represent the angular velocities about the three principal axes. Show that x^2 + y^2 + z^2 is a
constant. Find the critical points and analyze their linear stability. Check by throwing a non-spherical
object (a book?) in the air.

16. A bead moves along a smooth circular wire of radius a which is rotating about a vertical axis with
constant angular speed ω. Taking gravity and centrifugal forces into account, the motion of the bead
is given by

a \frac{d^2\theta}{dt^2} = -g \sin\theta + a \omega^2 \cos\theta \sin\theta,

where θ is the angular position of the bead with respect to the downward vertical position. Find the
equilibrium positions and their stability as the parameter μ = aω^2/g is varied.

17. Find a Lyapunov function of the form V = ax^2 + by^2 to investigate the global stability of the critical
point x = y = 0 of the system of equations

\frac{dx}{dt} = -2x^3 + 3xy^2,

\frac{dy}{dt} = -x^2 y - y^3.
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18. Let 



Solve the equation 

dx 

— - = A • x. 

dt 

Determine the critical points and their stability. 

19. Draw the bifurcation diagram of 

\frac{dx}{dt} = (x^2 - 2)^2 - 2(x^2 + 1)(r - 1) + (r - 1)^2,

where r is the bifurcation parameter, indicating the stability of each branch. 

20. Show that for all initial conditions the solutions of

\frac{dx}{dt} = -x + x^2 y - y^2,

\frac{dy}{dt} = -x^3 + x y - 6z,

\frac{dz}{dt} = 2y,

tend to x = y = z = 0 as t → ∞.
21. Draw the bifurcation diagram of

\frac{dx}{dt} = x^3 + x \left( (r - 2)^3 - 1 \right),

where r is the bifurcation parameter, indicating stable and unstable branches.
22. Solve the system of equations dx/dt = A · x where

/ -3 2 \ 

0-200 

A ~ ooii- 

\ 0/ 



23. Find a Lyapunov function for the system

\frac{dx}{dt} = -x - 2y^2,

\frac{dy}{dt} = x y - y^3.



24. Analyze the local stability of the origin in the following system 



dx 

~dt 
dy 

dt 

— = z + x 2 + y 3 . 
dt 



-2x + y + 3z + 82A 



-6y -5z + 2z 6 , 
25. Show that the origin is linearly stable

\frac{dx}{dt} = (x - by)(x^2 + y^2 - 1),

\frac{dy}{dt} = (ax + y)(x^2 + y^2 - 1),

where a, b > 0. Show also that the origin is stable to large perturbations, as long as they satisfy
x^2 + y^2 < 1.

26. Draw the bifurcation diagram and analyze the stability of 

dx , , 1 

— = — xix — r — 1) , 

dt y ' 10 

where r is the bifurcation parameter. 

27. Find the dynamical system corresponding to the Hamiltonian H(x, y) = x^2 + 2xy + y^2 and then solve
it.

28. Show that solutions of the system of differential equations 

- = -x + y*-z\ 

t = =-y+* 3 -* 3 > 

eventually approach the origin for all initial conditions. 

29. Find and plot all critical points (x, y) of

\frac{dx}{dt} = (r - 1)x - 3xy^2 - x^3,

\frac{dy}{dt} = (r - 1)y - 3x^2 y - y^3,

as functions of r. Determine the stability of (x, y) = (0, 0), and of one post-bifurcation branch.

30. Write in matrix form and solve

\frac{dx}{dt} = y + z,

\frac{dy}{dt} = z + x,

\frac{dz}{dt} = x + y.

31. Find the critical point (or points) of the Van der Pol equation

\frac{d^2 x}{dt^2} - a(1 - x^2)\frac{dx}{dt} + x = 0, \quad a > 0,

and determine its (or their) stability to small perturbations. For a = 1, plot the dx/dt, x phase plane
including critical points and vector fields.
32. Consider a straight line between x = 0 and x = l. Remove the middle half (i.e. the portion between
x = l/4 and x = 3l/4). Repeat the process on the two pieces that are left. Find the dimension of
what is left after an infinite number of iterations.

33. Classify the critical points of 

dx 

1 = 2y ~ x2 + 1 > 



and analyze their stability. 
34. Determine if the origin is stable if dx/dt = A · x, where




35. Find a Lyapunov function of the form V = ax^2 + by^2 to investigate the global stability of the critical
point x = y = 0 of the system of equations

\frac{dx}{dt} = -2x^3 + 3xy^2,

\frac{dy}{dt} = -x^2 y - y^3.

36. Draw a bifurcation diagram for the differential equation 

\frac{dx}{dt} = (x - 3)(x^2 - r),

where r is the bifurcation parameter. Analyze linear stability and indicate stable and unstable 
branches. 



37. Solve the following system of differential equations using generalized eigenvectors

\frac{dx}{dt} = -5x + 2y + z,

\frac{dy}{dt} = -5y + 3z,

\frac{dz}{dt} = -5z.


38. Analyze the linear stability of the critical point of

\frac{dx}{dt} = 2y + y^2,

\frac{dy}{dt} = -r + 2x^2.


39. Show that the solutions of

\frac{dx}{dt} = y - x^3,

\frac{dy}{dt} = -x - y^3,

tend to (0, 0) as t → ∞.
40. Sketch the bifurcation diagram showing the stable and unstable steady states of

\frac{dx}{dt} = r x (1 - x) - x,

where r is the bifurcation parameter.

41. Show in parameter space the different possible behaviors of

\frac{dx}{dt} = a + x^2 y - 2bx - x,

\frac{dy}{dt} = bx - x^2 y,

where a, b > 0.

42. Show that the Henon-Heiles system

\frac{d^2 x}{dt^2} = -x - 2xy,

\frac{d^2 y}{dt^2} = -y + y^2 - x^2,

is Hamiltonian. Find the Hamiltonian of the system, and determine the stability of the critical point
at the origin.

43. Solve dx/dt = A · x where

A ~ \ 2 

using the exponential matrix. 

44. Sketch the steady state bifurcation diagrams of 

\frac{dx}{dt} = (x - r)(x + r) \left( (x - 3)^2 + (r - 1)^2 - 1 \right),

where r is the bifurcation parameter. Determine the linear stability of each branch; indicate the stable 
and unstable ones differently on the diagram. 



45. Classify the critical point of 

f_x 
dl 



37j + (r - r )x = 0. 



46. Show that x = 0 is a stable critical point of the differential equation

n=0 

where a n > 0, n = 0, 1, • • • , N. 

47. Find the stability of the critical points of the Duffing equation

\frac{d^2 x}{dt^2} + a \frac{dx}{dt} - bx + x^3 = 0,

for positive and negative values of a and b. Sketch the flow lines.
48. Find a Lyapunov function to investigate the critical point x = y = 0 of the system of equations

\frac{dx}{dt} = -2x^3 + 3xy^2,

\frac{dy}{dt} = -x^2 y - y^3.



49. The populations x and y of two competing animal species are governed by 

dx 

—- = x — 2xy. 

dt J ' 

dy 

What are the steady-state populations? Is the situation stable? 

50. For the Lorenz equations with b = 8/3, r = 28, and initial conditions x(0) = 2, y(0) = 1, z(0) = 3,
numerically integrate the Lorenz equations for two cases, σ = 1, σ = 10. For each case plot the
trajectory in (x, y, z) phase space and plot x(t), y(t), z(t) for t ∈ [0, 50]. Change the initial condition
on x to x(0) = 2.002 and plot the difference in the predictions of x versus time for both values of σ.

51. Use the Poincare sphere to find all critical points, finite and infinite of the system

\frac{dx}{dt} = 2x - 2xy,

\frac{dy}{dt} = 2y - x^2 + y^2.

Plot families of trajectories in the x, y phase space and the X, Y projection of the Poincare sphere.

52. For the Lorenz equations with σ = 10, b = 8/3, and initial conditions x(0) = 0, y(0) = 1, z(0) = 0,
numerically integrate the Lorenz equations for three cases, r = 10, r = 24, and r = 28. For each case,
plot the trajectory in (x, y, z) phase space and plot x(t), y(t), z(t) for t ∈ [0, 50]. Change the initial
condition on x to x(0) = 0.002 and plot the difference in the predictions of x versus time for all three
values of r.
Appendix



The material in this section is not covered in detail; some is review from undergraduate 
classes. 



10.1 Taylor series 

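The Taylor series of y(x) about the point x = x_0 is

y(x) = y(x_0) + \frac{dy}{dx}\bigg|_{x = x_0} (x - x_0) + \frac{1}{2} \frac{d^2 y}{dx^2}\bigg|_{x = x_0} (x - x_0)^2 + \frac{1}{6} \frac{d^3 y}{dx^3}\bigg|_{x = x_0} (x - x_0)^3 + \cdots + \frac{1}{n!} \frac{d^n y}{dx^n}\bigg|_{x = x_0} (x - x_0)^n + \cdots (10.1)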



Example 10.1

Find the Taylor series of y(x) about x = 0 if

y(x) = \frac{1}{(1 + x)^n}. (10.2)

Direct substitution reveals that the answer is

y(x) = 1 - nx + \frac{(-n)(-n - 1)}{2!} x^2 + \frac{(-n)(-n - 1)(-n - 2)}{3!} x^3 + \cdots (10.3)
10.2 Trigonometric relations

\sin x \sin y = \frac{1}{2}\cos(x - y) - \frac{1}{2}\cos(x + y), (10.4)

\sin x \cos y = \frac{1}{2}\sin(x + y) + \frac{1}{2}\sin(x - y), (10.5)

\cos x \cos y = \frac{1}{2}\cos(x - y) + \frac{1}{2}\cos(x + y), (10.6)

\sin^2 x = \frac{1}{2} - \frac{1}{2}\cos 2x, (10.7)

\sin x \cos x = \frac{1}{2}\sin 2x, (10.8)

\cos^2 x = \frac{1}{2} + \frac{1}{2}\cos 2x, (10.9)

\sin^3 x = \frac{3}{4}\sin x - \frac{1}{4}\sin 3x, (10.10)

\sin^2 x \cos x = \frac{1}{4}\cos x - \frac{1}{4}\cos 3x, (10.11)

\sin x \cos^2 x = \frac{1}{4}\sin x + \frac{1}{4}\sin 3x, (10.12)

\cos^3 x = \frac{3}{4}\cos x + \frac{1}{4}\cos 3x, (10.13)

\sin^4 x = \frac{3}{8} - \frac{1}{2}\cos 2x + \frac{1}{8}\cos 4x, (10.14)

\sin^3 x \cos x = \frac{1}{4}\sin 2x - \frac{1}{8}\sin 4x, (10.15)

\sin^2 x \cos^2 x = \frac{1}{8} - \frac{1}{8}\cos 4x, (10.16)

\sin x \cos^3 x = \frac{1}{4}\sin 2x + \frac{1}{8}\sin 4x, (10.17)

\cos^4 x = \frac{3}{8} + \frac{1}{2}\cos 2x + \frac{1}{8}\cos 4x, (10.18)

\sin^5 x = \frac{5}{8}\sin x - \frac{5}{16}\sin 3x + \frac{1}{16}\sin 5x, (10.19)

\sin^4 x \cos x = \frac{1}{8}\cos x - \frac{3}{16}\cos 3x + \frac{1}{16}\cos 5x, (10.20)

\sin^3 x \cos^2 x = \frac{1}{8}\sin x + \frac{1}{16}\sin 3x - \frac{1}{16}\sin 5x, (10.21)

\sin^2 x \cos^3 x = \frac{1}{8}\cos x - \frac{1}{16}\cos 3x - \frac{1}{16}\cos 5x, (10.22)

\sin x \cos^4 x = \frac{1}{8}\sin x + \frac{3}{16}\sin 3x + \frac{1}{16}\sin 5x, (10.23)

\cos^5 x = \frac{5}{8}\cos x + \frac{5}{16}\cos 3x + \frac{1}{16}\cos 5x. (10.24)
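Power-reduction identities of this kind are easy to mistype, so a numerical spot check is useful. The Python sketch below, assuming NumPy, evaluates both sides of a few standard identities for powers and products of sin x and cos x on a grid and reports the largest difference; the dictionary keys are descriptive labels chosen for this illustration.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 201)
s, c = np.sin(x), np.cos(x)

# Each entry maps a label to (left side, right side) of a standard identity
identities = {
    "sin^3 x": (s**3, 0.75 * s - 0.25 * np.sin(3 * x)),
    "cos^3 x": (c**3, 0.75 * c + 0.25 * np.cos(3 * x)),
    "sin^4 x": (s**4, 3 / 8 - 0.5 * np.cos(2 * x) + np.cos(4 * x) / 8),
    "sin^2 x cos^2 x": (s**2 * c**2, 1 / 8 - np.cos(4 * x) / 8),
    "sin^5 x": (s**5, 5 / 8 * s - 5 / 16 * np.sin(3 * x) + np.sin(5 * x) / 16),
    "sin^2 x cos^3 x": (s**2 * c**3,
                        c / 8 - np.cos(3 * x) / 16 - np.cos(5 * x) / 16),
    "sin x cos^4 x": (s * c**4,
                      s / 8 + 3 / 16 * np.sin(3 * x) + np.sin(5 * x) / 16),
    "cos^5 x": (c**5, 5 / 8 * c + 5 / 16 * np.cos(3 * x) + np.cos(5 * x) / 16),
}

max_errors = {label: float(np.max(np.abs(lhs - rhs)))
              for label, (lhs, rhs) in identities.items()}
for label, err in max_errors.items():
    print(f"{label:18s} max |lhs - rhs| = {err:.2e}")
```

All reported differences should be at the level of floating-point rounding.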



10.3 Hyperbolic functions 

The hyperbolic functions are defined as follows:

\sinh\theta = \frac{e^{\theta} - e^{-\theta}}{2}, (10.25)

\cosh\theta = \frac{e^{\theta} + e^{-\theta}}{2}. (10.26)

10.4 Routh-Hurwitz criterion 

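Here we consider the Routh-Hurwitz¹ criterion. The polynomial equation

a_0 s^n + a_1 s^{n-1} + \cdots + a_{n-1} s + a_n = 0, (10.27)

has roots with negative real parts if and only if the following conditions are satisfied:

• a_1/a_0, a_2/a_0, ..., a_n/a_0 > 0,

• D_i > 0, i = 1, ..., n.

The Hurwitz determinants D_i are defined by

D_1 = a_1, (10.28)

D_2 = \begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix}, (10.29)

D_3 = \begin{vmatrix} a_1 & a_3 & a_5 \\ a_0 & a_2 & a_4 \\ 0 & a_1 & a_3 \end{vmatrix}, (10.30)

D_n = \begin{vmatrix}
a_1 & a_3 & a_5 & \cdots & a_{2n-1} \\
a_0 & a_2 & a_4 & \cdots & a_{2n-2} \\
0 & a_1 & a_3 & \cdots & a_{2n-3} \\
0 & a_0 & a_2 & \cdots & a_{2n-4} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_n
\end{vmatrix}, (10.31)

with a_i = 0, if i > n.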
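The criterion translates directly into a few lines of code. The following Python sketch, assuming NumPy, builds the Hurwitz array with entry a_{2j-i+1} in row i and column j (counting from zero, with coefficients outside 0..n taken as zero) and evaluates its leading principal minors. The function names are illustrative, and dividing by the leading coefficient first is a convenience adopted here. As test cases it uses the two cubic characteristic polynomials of the Lorenz examples, λ^3 + (41/3)λ^2 + (160/3)λ + 480 and λ^3 + (41/3)λ^2 + (304/3)λ + 1440.

```python
import numpy as np

def hurwitz_determinants(coeffs):
    """Return [D_1, ..., D_n] for a_0 s^n + a_1 s^(n-1) + ... + a_n."""
    a = [float(c) for c in coeffs]
    n = len(a) - 1

    def coef(k):
        # coefficients with an index outside 0..n are zero
        return a[k] if 0 <= k <= n else 0.0

    H = np.array([[coef(2 * j - i + 1) for j in range(n)] for i in range(n)])
    return [float(np.linalg.det(H[:k, :k])) for k in range(1, n + 1)]

def is_hurwitz_stable(coeffs):
    """True if all coefficient ratios and all Hurwitz determinants are positive."""
    a = np.asarray(coeffs, dtype=float) / float(coeffs[0])
    return bool(np.all(a[1:] > 0) and all(d > 0 for d in hurwitz_determinants(a)))

poly_r10 = [1.0, 41.0 / 3.0, 160.0 / 3.0, 480.0]    # Lorenz example with r = 10
poly_r28 = [1.0, 41.0 / 3.0, 304.0 / 3.0, 1440.0]   # Lorenz example with r = 28

for name, poly in (("r = 10", poly_r10), ("r = 28", poly_r28)):
    print(name, "D =", np.round(hurwitz_determinants(poly), 3),
          "stable:", is_hurwitz_stable(poly),
          "max Re(root) =", round(float(np.roots(poly).real.max()), 4))
```

The determinant test and the directly computed roots should agree: the first polynomial has all roots in the left half plane, while the second has a complex pair with a small positive real part.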



1 Edward John Routh, 1831-1907, Canadian-born English mathematician, and Adolf Hurwitz, 1859-1919, 
German mathematician. 
10.5 Infinite series

Definition: A power series is of the form

\sum_{n=0}^{\infty} a_n (x - a)^n. (10.32)

The series converges if |x - a| < R, where R is the radius of convergence.



Definition: A function f(x) is said to be analytic at x = a if / and all its derivatives exist 
at this point. 

An analytic function can be expanded in a Taylor series: 



f(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + \cdots (10.33)



where the function and its derivatives on the right side are evaluated at x = a. This is a 
power series for f(x). We have used primes to indicate derivatives. 



Example 10.2

Expand (1 + x)^n about x = 0.

f(x) = (1 + x)^n, (10.34)

f(0) = 1, (10.35)

f'(0) = n, (10.36)

f''(0) = n(n - 1), (10.37)

\vdots (10.38)

(1 + x)^n = 1 + nx + \frac{1}{2} n(n - 1) x^2 + \cdots (10.39)


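A function of two variables f(x, y) can be similarly expanded

f(x, y) = f\big|_{a,b} + \frac{\partial f}{\partial x}\bigg|_{a,b} (x - a) + \frac{\partial f}{\partial y}\bigg|_{a,b} (y - b) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}\bigg|_{a,b} (x - a)^2 + \frac{\partial^2 f}{\partial x \partial y}\bigg|_{a,b} (x - a)(y - b) + \frac{1}{2} \frac{\partial^2 f}{\partial y^2}\bigg|_{a,b} (y - b)^2 + \cdots (10.40)

if f and all its partial derivatives exist and are evaluated at x = a, y = b.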
10.6 Asymptotic expansions

Definition: Consider two functions f(x) and g(x). We write that

f(x) \sim g(x), if \lim_{x \to a} \frac{f(x)}{g(x)} = 1;

f(x) = o(g(x)), if \lim_{x \to a} \frac{f(x)}{g(x)} = 0;

f(x) = O(g(x)), if \lim_{x \to a} \left| \frac{f(x)}{g(x)} \right| = constant.

10.7 Special functions 

10.7.1 Gamma function 

The Gamma function may be thought of as an extension to the factorial function. Recall the 
factorial function requires an integer argument. The Gamma function admits real arguments; 
when the argument of the Gamma function is an integer, one finds it is directly related to 
the factorial function. The Gamma function is defined by 



\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1} dt. (10.41)

Generally, we are interested in x > 0, but results are available for all x. Some properties are:

1. Γ(1) = 1.

2. Γ(x) = (x - 1)Γ(x - 1), x > 1.

3. Γ(x) = (x - 1)(x - 2) ··· (x - r)Γ(x - r), x > r.

4. Γ(n) = (n - 1)!, where n is a positive integer.

5. \Gamma(x) \sim \sqrt{\frac{2\pi}{x}} x^x e^{-x} \left( 1 + \frac{1}{12x} + \frac{1}{288x^2} + \cdots \right), (Stirling's formula).

Bender and Orszag show that Stirling's² formula is a divergent series. It is an asymptotic
series, but as more terms are added, the solution can actually get worse. The Gamma
function and its amplitude are plotted in Fig. 10.1.
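These properties can be checked with the Gamma function in the Python standard library. The sketch below compares math.gamma with the factorial, the recursion Γ(x) = (x - 1)Γ(x - 1), and the first three terms of Stirling's formula; the helper name stirling and the sample arguments are choices made for this illustration.

```python
import math

def stirling(x, terms=3):
    """Stirling's formula truncated after `terms` terms of the series (1 to 3)."""
    series = [1.0, 1.0 / (12.0 * x), 1.0 / (288.0 * x**2)]
    return math.sqrt(2.0 * math.pi / x) * x**x * math.exp(-x) * sum(series[:terms])

# Property 4: Gamma(n) = (n - 1)! for a positive integer n
print("Gamma(5) =", math.gamma(5), " 4! =", math.factorial(4))

# Property 2: Gamma(x) = (x - 1) Gamma(x - 1)
print("Gamma(4.5) =", math.gamma(4.5), " 3.5 Gamma(3.5) =", 3.5 * math.gamma(3.5))

# Property 5: relative error of the truncated Stirling series
for x in (2.0, 5.0, 10.0):
    rel_errors = [abs(stirling(x, n) / math.gamma(x) - 1.0) for n in (1, 2, 3)]
    print(f"x = {x:5.1f}  relative errors with 1, 2, 3 terms:",
          ", ".join(f"{e:.2e}" for e in rel_errors))
```

For these moderate values of x the first few terms improve the estimate; the eventual divergence noted above only shows up when many more terms are retained at fixed x.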



10.7.2 Beta function 






The beta function is defined by

B(p, q) = \int_0^1 x^{p-1} (1 - x)^{q-1} dx. (10.42)

Property:

B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)}. (10.43)

2 James Stirling, 1692-1770, Scottish mathematician and member of a prominent Jacobite family. 
Figure 10.1: Gamma function and amplitude of Gamma function.



10.7.3 Riemann zeta function 



This is defined as

\zeta(x) = \sum_{n=1}^{\infty} n^{-x}. (10.44)

The function can be evaluated in closed form for even integer values of x. It can be shown
that ζ(2) = π^2/6, ζ(4) = π^4/90, ζ(6) = π^6/945, ..., ζ(2n) = (-1)^{n+1} B_{2n} (2π)^{2n}/2/(2n)!,
where B_{2n} is a so-called Bernoulli number, which can be found via a complicated recursion
formula. All negative even integer values of x give ζ(x) = 0. Further lim_{x→∞} ζ(x) = 1. For
large negative values of x, the Riemann zeta function oscillates with increasing amplitude.
Plots of the Riemann zeta function for x ∈ [-1, 3] and the amplitude of the Riemann zeta
function over a broader domain on a logarithmic scale are shown in Fig. 10.2.
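The closed-form values can be compared against partial sums of the defining series. The Python sketch below uses only the standard library; the first three even-index Bernoulli numbers, B_2 = 1/6, B_4 = -1/30, B_6 = 1/42, are entered by hand rather than generated by the recursion, and the function names and the number of retained terms are assumptions made for this illustration.

```python
import math
from fractions import Fraction

def zeta_partial(x, n_terms=100_000):
    """Partial sum of the series sum_{n >= 1} n^(-x); slow to converge near x = 1."""
    return sum(float(n) ** (-x) for n in range(1, n_terms + 1))

def zeta_even(n, bernoulli_2n):
    """zeta(2n) = (-1)^(n+1) B_2n (2 pi)^(2n) / (2 (2n)!)."""
    return ((-1) ** (n + 1) * float(bernoulli_2n) * (2.0 * math.pi) ** (2 * n)
            / (2.0 * math.factorial(2 * n)))

bernoulli = {1: Fraction(1, 6), 2: Fraction(-1, 30), 3: Fraction(1, 42)}

for n, b2n in bernoulli.items():
    print(f"zeta({2 * n}): closed form = {zeta_even(n, b2n):.10f}, "
          f"partial sum = {zeta_partial(2 * n):.10f}")
```

The partial sum for ζ(2) is short by roughly the reciprocal of the number of terms, which shows how slowly the series converges for small x.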



Figure 10.2: Riemann zeta function and amplitude of Riemann zeta function.
10.7.4 Error functions

The error function is defined by

erf(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-\xi^2} d\xi, (10.45)

and the complementary error function by

erfc(x) = 1 - erf x. (10.46)

The error function and the error function complement are plotted in Fig. 10.3.

Figure 10.3: Error function and error function complement.
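The integral definition of the error function can be checked against the library implementation. The Python sketch below uses only the standard library and a composite Simpson rule; the quadrature routine, its name, and the number of subintervals are choices made for this illustration.

```python
import math

def erf_quadrature(x, n=2000):
    """Composite Simpson rule for (2/sqrt(pi)) * integral from 0 to x of
    exp(-xi^2) d xi; n must be even."""
    h = x / n
    total = 1.0 + math.exp(-x * x)          # integrand at the two end points
    for i in range(1, n):
        xi = i * h
        total += (4.0 if i % 2 else 2.0) * math.exp(-xi * xi)
    return 2.0 / math.sqrt(math.pi) * total * h / 3.0

for x in (-1.5, 0.3, 0.7, 2.0):
    print(f"x = {x:5.2f}  quadrature = {erf_quadrature(x):.12f}  "
          f"math.erf = {math.erf(x):.12f}  "
          f"erf + erfc = {math.erf(x) + math.erfc(x):.12f}")
```

The last column confirms the relation between the error function and its complement for each sample point.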
The imaginary error function is defined by

erfi(z) = -i erf(iz), (10.47)

where z ∈ ℂ¹. For real arguments, x ∈ ℝ¹, it can be shown that erfi(x) = -i erf(ix) ∈ ℝ¹.
The imaginary error function is plotted in Fig. 10.4 for a real argument, x ∈ ℝ¹.

Figure 10.4: Imaginary error function, erfi(x), for real argument, x ∈ ℝ¹.
10.7.5 Fresnel integrals

The Fresnel³ integrals are defined by

C(x) = \int_0^x \cos\frac{\pi t^2}{2} dt, (10.48)

S(x) = \int_0^x \sin\frac{\pi t^2}{2} dt. (10.49)

The Fresnel cosine and sine integrals are plotted in Fig. 10.5.

Figure 10.5: Fresnel cosine, C(x), and sine, S(x), integrals.

10.7.6 Sine-, cosine-, and exponential-integral functions 

The sine-integral function is defined by 

sin£ 



Si(x) 
and the cosine-integral function by 

Ci(ar) 



cos£ 



d£, 



d£. 



(10.48) 
(10.49) 



(10.50) 



(10.51) 



The sine integral function is real valued for $x \in (-\infty, \infty)$. The cosine integral function is
real valued for $x \in [0, \infty)$. We also have $\lim_{x \to 0^+} \operatorname{Ci}(x) \to -\infty$. The cosine integral takes on
a value of zero at discrete positive real values, and has an amplitude which slowly decays as
$x \to \infty$. The sine integral and cosine integral functions are plotted in Fig. 10.6.
The exponential-integral function is defined by

$$ \operatorname{Ei}(x) = -\int_{-x}^{\infty} \frac{e^{-\xi}}{\xi}\, d\xi = \int_{-\infty}^{x} \frac{e^{\xi}}{\xi}\, d\xi. \tag{10.52} $$

The exponential integral function is plotted in Fig. 10.7. Note we must use the Cauchy
principal value of the integral if $x > 0$.
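
The Fresnel integrals and the sine integral have finite limits and bounded integrands, so they can be evaluated by straightforward quadrature. The following is a minimal sketch under that approach; the function names are hypothetical and Simpson's rule is just one reasonable choice. The cosine and exponential integrals are omitted because their singular or infinite-range integrands need more care.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3


def fresnel_c(x):
    """C(x) = integral_0^x cos(pi t^2 / 2) dt."""
    return simpson(lambda t: math.cos(math.pi * t * t / 2), 0.0, x)


def fresnel_s(x):
    """S(x) = integral_0^x sin(pi t^2 / 2) dt."""
    return simpson(lambda t: math.sin(math.pi * t * t / 2), 0.0, x)


def sine_integral(x):
    """Si(x); sin(xi)/xi has a removable singularity at xi = 0, where it equals 1."""
    return simpson(lambda xi: math.sin(xi) / xi if xi != 0.0 else 1.0, 0.0, x)


if __name__ == "__main__":
    print(fresnel_c(1.0), fresnel_s(1.0))
    print(sine_integral(math.pi))   # the first and largest maximum of Si
```

For large $x$ the Fresnel integrands oscillate rapidly, so the number of Simpson subintervals `n` would need to grow with $x^2$ to keep the same accuracy.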



3 Augustin-Jean Fresnel, 1788-1827, French physicist noted for work in optics.





Figure 10.6: Sine integral function, Si(x), and cosine integral function Ci(x) 




Figure 10.7: Exponential integral function, Ei(x). 



10.7.7 Elliptic integrals 

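The Legendre elliptic integral of the first kind is

$$ F(y,k) = \int_0^y \frac{d\eta}{\sqrt{(1-\eta^2)(1-k^2\eta^2)}}. \tag{10.53} $$

Another common way of writing the elliptic integral is to take $\eta = \sin\phi$, so that

$$ F(\phi,k) = \int_0^\phi \frac{d\phi}{\sqrt{1-k^2\sin^2\phi}}. \tag{10.54} $$

The Legendre elliptic integral of the second kind is

$$ E(y,k) = \int_0^y \sqrt{\frac{1-k^2\eta^2}{1-\eta^2}}\, d\eta, \tag{10.55} $$

which, on again using $\eta = \sin\phi$, becomes

$$ E(\phi,k) = \int_0^\phi \sqrt{1-k^2\sin^2\phi}\, d\phi. \tag{10.56} $$

The Legendre elliptic integral of the third kind is

$$ \Pi(y,n,k) = \int_0^y \frac{d\eta}{(1+n\eta^2)\sqrt{(1-\eta^2)(1-k^2\eta^2)}}, \tag{10.57} $$

which is equivalent to

$$ \Pi(\phi,n,k) = \int_0^\phi \frac{d\phi}{(1+n\sin^2\phi)\sqrt{1-k^2\sin^2\phi}}. \tag{10.58} $$

For $\phi = \pi/2$, we have the complete elliptic integrals

$$ F\left(\frac{\pi}{2},k\right) = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1-k^2\sin^2\phi}}, \tag{10.59} $$

$$ E\left(\frac{\pi}{2},k\right) = \int_0^{\pi/2} \sqrt{1-k^2\sin^2\phi}\, d\phi, \tag{10.60} $$

$$ \Pi\left(\frac{\pi}{2},n,k\right) = \int_0^{\pi/2} \frac{d\phi}{(1+n\sin^2\phi)\sqrt{1-k^2\sin^2\phi}}. \tag{10.61} $$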
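
The trigonometric forms of the first and second kinds, Eqs. (10.54) and (10.56), are easy to evaluate numerically for $0 \le k < 1$. The sketch below is illustrative only, with hypothetical function names. As an independent check on the complete integral of the first kind it uses the arithmetic-geometric mean identity $F(\pi/2, k) = \pi / (2\,\mathrm{AGM}(1, \sqrt{1-k^2}))$, which is a standard result that is not discussed in these notes.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3


def elliptic_f(phi, k):
    """First kind, F(phi, k), in the trigonometric form; assumes 0 <= k < 1."""
    return simpson(lambda p: 1.0 / math.sqrt(1.0 - (k * math.sin(p)) ** 2), 0.0, phi)


def elliptic_e(phi, k):
    """Second kind, E(phi, k), in the trigonometric form; assumes 0 <= k <= 1."""
    return simpson(lambda p: math.sqrt(1.0 - (k * math.sin(p)) ** 2), 0.0, phi)


def complete_f_agm(k):
    """Complete first-kind integral via the arithmetic-geometric mean (outside the text)."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):                 # quadratic convergence; 40 steps is ample
        a, b = (a + b) / 2, math.sqrt(a * b)
    return math.pi / (2 * a)


if __name__ == "__main__":
    k = 0.5
    print(elliptic_f(math.pi / 2, k), complete_f_agm(k))
    print(elliptic_e(math.pi / 2, k))
```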

10.7.8 Hypergeometric functions 

A generalized hypergeometric function is defined by

$$ {}_pF_q\left(\{a_1, \ldots, a_p\}, \{b_1, \ldots, b_q\}; x\right) = \sum_{k=0}^{\infty} \frac{(a_1)_k (a_2)_k \cdots (a_p)_k}{(b_1)_k (b_2)_k \cdots (b_q)_k} \frac{x^k}{k!}, \tag{10.62} $$

where the rising factorial notation, $(s)_k$, is defined by

$$ (s)_k \equiv \frac{\Gamma(s+k)}{\Gamma(s)}. \tag{10.63} $$

There are many special hypergeometric functions. If $p = 2$ and $q = 1$, we have Gauss's
hypergeometric function ${}_2F_1(\{a_1, a_2\}, \{b_1\}; x)$. Since there are only three parameters, Gauss's
hypergeometric function is sometimes denoted as ${}_2F_1(a, b, c, x)$. An integral representation
of Gauss's hypergeometric function is

$$ {}_2F_1(a, b, c, x) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)} \int_0^1 t^{b-1} (1-t)^{c-b-1} (1-tx)^{-a}\, dt. \tag{10.64} $$

For special values of parameters, hypergeometric functions can reduce to other functions
such as $\tanh^{-1}$.
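
The series in Eq. (10.62) can be summed term by term, since consecutive terms differ by the factor $\prod_i (a_i + k) / \prod_j (b_j + k) \cdot x/(k+1)$. The following is a minimal sketch with a hypothetical function name `pfq`; it assumes the series converges at the chosen $x$ (for example $|x| < 1$ for ${}_2F_1$) and that no $b_j$ is a non-positive integer. As one instance of the reduction mentioned above, $\tanh^{-1} x = x\; {}_2F_1(\tfrac{1}{2}, 1, \tfrac{3}{2}, x^2)$, which follows from $(\tfrac12)_k (1)_k / ((\tfrac32)_k\, k!) = 1/(2k+1)$.

```python
import math


def pfq(a_list, b_list, x, terms=200):
    """Partial sum of the generalized hypergeometric series pFq({a}, {b}; x)."""
    total, term = 0.0, 1.0      # term = prod (a_i)_k / prod (b_j)_k * x^k / k!
    for k in range(terms):
        total += term
        ratio = x / (k + 1)
        for a in a_list:
            ratio *= a + k      # (a)_{k+1} = (a)_k * (a + k)
        for b in b_list:
            ratio /= b + k      # (b)_{k+1} = (b)_k * (b + k)
        term *= ratio
    return total


if __name__ == "__main__":
    x = 0.3
    print(x * pfq([0.5, 1.0], [1.5], x * x), math.atanh(x))   # inverse hyperbolic tangent
    print(pfq([], [], 1.0), math.e)                           # 0F0 is the exponential
```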
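10.7.9 Airy functions

The Airy functions $\operatorname{Ai}(x)$ and $\operatorname{Bi}(x)$ are most compactly defined as the two linearly
independent solutions to the second order differential equation $y'' - xy = 0$, yielding
$y = C_1 \operatorname{Ai}(x) + C_2 \operatorname{Bi}(x)$. They can be expressed in a variety of other forms. In terms of the
so-called confluent hypergeometric limit function ${}_0F_1$, we have

$$ \operatorname{Ai}(x) = \frac{1}{3^{2/3}\, \Gamma\left(\frac{2}{3}\right)}\; {}_0F_1\left(\{\}; \left\{\tfrac{2}{3}\right\}; \tfrac{1}{9}x^3\right) - \frac{x}{3^{1/3}\, \Gamma\left(\frac{1}{3}\right)}\; {}_0F_1\left(\{\}; \left\{\tfrac{4}{3}\right\}; \tfrac{1}{9}x^3\right), \tag{10.65} $$

$$ \operatorname{Bi}(x) = \frac{1}{3^{1/6}\, \Gamma\left(\frac{2}{3}\right)}\; {}_0F_1\left(\{\}; \left\{\tfrac{2}{3}\right\}; \tfrac{1}{9}x^3\right) + \frac{3^{1/6}\, x}{\Gamma\left(\frac{1}{3}\right)}\; {}_0F_1\left(\{\}; \left\{\tfrac{4}{3}\right\}; \tfrac{1}{9}x^3\right). \tag{10.66} $$

The Airy functions are plotted in Fig. 10.8. In integral form, the Airy functions are, for
$x \in \mathbb{R}^1$,

$$ \operatorname{Ai}(x) = \frac{1}{\pi} \int_0^\infty \cos\left(\frac{1}{3}t^3 + xt\right) dt, \tag{10.67} $$

$$ \operatorname{Bi}(x) = \frac{1}{\pi} \int_0^\infty \left( \exp\left(-\frac{1}{3}t^3 + xt\right) + \sin\left(\frac{1}{3}t^3 + xt\right) \right) dt. \tag{10.68} $$

Figure 10.8: Airy functions Ai (x) and Bi (x).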
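
The hypergeometric forms in Eqs. (10.65)-(10.66) give a simple way to evaluate the Airy functions for moderate $|x|$. The sketch below is illustrative, with hypothetical function names, and sums the ${}_0F_1$ series directly; for large $|x|$ the series suffers from cancellation and an asymptotic form would be a better choice. The residual function checks the defining equation $y'' - xy = 0$ by a central difference.

```python
import math


def hyp0f1(b, z, terms=60):
    """Series for the confluent hypergeometric limit function 0F1({}; {b}; z)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= z / ((b + k) * (k + 1))
    return total


def airy_ai(x):
    z = x ** 3 / 9
    return (hyp0f1(2 / 3, z) / (3 ** (2 / 3) * math.gamma(2 / 3))
            - x * hyp0f1(4 / 3, z) / (3 ** (1 / 3) * math.gamma(1 / 3)))


def airy_bi(x):
    z = x ** 3 / 9
    return (hyp0f1(2 / 3, z) / (3 ** (1 / 6) * math.gamma(2 / 3))
            + 3 ** (1 / 6) * x * hyp0f1(4 / 3, z) / math.gamma(1 / 3))


def ode_residual(y, x, h=1e-3):
    """Central-difference estimate of y'' - x y, which should be near zero."""
    return (y(x + h) - 2 * y(x) + y(x - h)) / (h * h) - x * y(x)


if __name__ == "__main__":
    print(airy_ai(0.0), airy_bi(0.0))
    print(ode_residual(airy_ai, -2.0), ode_residual(airy_bi, 1.0))
```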



10.7.10 Dirac δ distribution and Heaviside function

Definition: The Dirac⁴ δ-distribution (or generalized function, or simply function), is defined
by

$$ \int_\alpha^\beta f(x)\, \delta(x-a)\, dx = \begin{cases} 0 & \text{if } a \notin [\alpha, \beta], \\ f(a) & \text{if } a \in [\alpha, \beta]. \end{cases} \tag{10.69} $$



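4 Paul Adrien Maurice Dirac, 1902-1984, English physicist.

From this it follows that

$$ \delta(x-a) = 0 \quad \text{if } x \ne a, \tag{10.70} $$

$$ \int_{-\infty}^{\infty} \delta(x-a)\, dx = 1. \tag{10.71} $$

The δ-distribution may be imagined in a limiting fashion as

$$ \delta(x-a) = \lim_{\epsilon \to 0^+} \Delta_\epsilon(x-a), \tag{10.72} $$

where $\Delta_\epsilon(x-a)$ has one of the following forms:

1.
$$ \Delta_\epsilon(x-a) = \begin{cases} 0 & \text{if } x < a - \frac{\epsilon}{2}, \\ \frac{1}{\epsilon} & \text{if } a - \frac{\epsilon}{2} \le x \le a + \frac{\epsilon}{2}, \\ 0 & \text{if } x > a + \frac{\epsilon}{2}, \end{cases} \tag{10.73} $$

2.
$$ \Delta_\epsilon(x-a) = \frac{\epsilon}{\pi\left((x-a)^2 + \epsilon^2\right)}, \tag{10.74} $$

3.
$$ \Delta_\epsilon(x-a) = \frac{1}{\sqrt{\pi\epsilon}}\, e^{-(x-a)^2/\epsilon}. \tag{10.75} $$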
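
The sifting property in Eq. (10.69) can be illustrated numerically by integrating $f(x)\, \Delta_\epsilon(x-a)$ for a small but finite $\epsilon$. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the midpoint rule on a truncated interval $[a-L, a+L]$ stands in for the integral over the real line, and $f = \cos$, $a = 0.3$, $\epsilon = 0.01$ are arbitrary test choices.

```python
import math


def top_hat(x, a, eps):
    """Generator 1: height 1/eps on an interval of width eps centered at a."""
    return 1.0 / eps if abs(x - a) <= eps / 2 else 0.0


def lorentzian(x, a, eps):
    """Generator 2: eps / (pi ((x - a)^2 + eps^2))."""
    return eps / (math.pi * ((x - a) ** 2 + eps ** 2))


def gaussian(x, a, eps):
    """Generator 3: exp(-(x - a)^2 / eps) / sqrt(pi eps)."""
    return math.exp(-((x - a) ** 2) / eps) / math.sqrt(math.pi * eps)


def sift(f, generator, a, eps, half_width=5.0, n=200000):
    """Midpoint-rule estimate of the integral of f(x) * generator over [a - L, a + L]."""
    h = 2 * half_width / n
    total = 0.0
    for j in range(n):
        x = a - half_width + (j + 0.5) * h
        total += f(x) * generator(x, a, eps)
    return total * h


if __name__ == "__main__":
    a, eps = 0.3, 0.01
    for gen in (top_hat, lorentzian, gaussian):
        print(gen.__name__, sift(math.cos, gen, a, eps), math.cos(a))
```

All three estimates approach $f(a)$ as $\epsilon$ is reduced, though the Lorentzian form converges most slowly because of its algebraically decaying tails.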



The derivative of the function

$$ h_\epsilon(x-a) = \begin{cases} 0 & \text{if } x < a - \frac{\epsilon}{2}, \\ \frac{1}{\epsilon}(x-a) + \frac{1}{2} & \text{if } a - \frac{\epsilon}{2} \le x \le a + \frac{\epsilon}{2}, \\ 1 & \text{if } x > a + \frac{\epsilon}{2}, \end{cases} \tag{10.76} $$

is $\Delta_\epsilon(x-a)$ in Eq. (10.73). If we define the Heaviside⁵ function, $H(x-a)$, as

$$ H(x-a) = \lim_{\epsilon \to 0^+} h_\epsilon(x-a), \tag{10.77} $$

then

$$ \frac{d}{dx} H(x-a) = \delta(x-a). \tag{10.78} $$

The generator of the Dirac function, $\Delta_\epsilon(x-a)$, and the generator of the Heaviside function,
$h_\epsilon(x-a)$, are plotted for $a = 0$ and $\epsilon = 1/5$ in Fig. 10.9. As $\epsilon \to 0$, $\Delta_\epsilon$ has its width decrease
and its height increase in such a fashion that its area remains constant; simultaneously $h_\epsilon$
has its slope steepen in the region where it jumps from zero to unity.



5 Oliver Heaviside, 1850-1925, English mathematician.

Figure 10.9: Generators of Dirac delta function and Heaviside function, $\Delta_\epsilon(x-a)$ and
$h_\epsilon(x-a)$, plotted for $a = 0$ and $\epsilon = 1/5$.

10.8 Total derivative 

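A function of several variables $f(x_1, x_2, \ldots, x_n)$ may be differentiated via the total derivative

$$ \frac{df}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}. \tag{10.79} $$

Multiplying through by $dt$, we get the useful formula

$$ df = \frac{\partial f}{\partial x_1}\, dx_1 + \frac{\partial f}{\partial x_2}\, dx_2 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n. \tag{10.80} $$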
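
As a quick check of Eq. (10.79), the total derivative can be compared with a finite-difference derivative along a path. The example below is a sketch with an arbitrarily chosen function and path, $f = x_1^2 x_2$ with $x_1 = \cos t$, $x_2 = \sin t$; none of these choices come from the text.

```python
import math


def f(x1, x2):
    return x1 ** 2 * x2


def path(t):
    """Hypothetical path x1(t) = cos t, x2(t) = sin t."""
    return math.cos(t), math.sin(t)


def total_derivative(t):
    """df/dt assembled from partial derivatives and path velocities."""
    x1, x2 = path(t)
    dx1_dt, dx2_dt = -math.sin(t), math.cos(t)
    df_dx1, df_dx2 = 2 * x1 * x2, x1 ** 2
    return df_dx1 * dx1_dt + df_dx2 * dx2_dt


def finite_difference(t, h=1e-6):
    """Central difference of f evaluated along the path."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)


if __name__ == "__main__":
    t = 0.7
    print(total_derivative(t), finite_difference(t))
```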



10.9 Leibniz's rule 



Differentiation of an integral is done using Leibniz's rule, Eq. (1.293):

$$ y(x) = \int_{a(x)}^{b(x)} f(x,t)\, dt, \tag{10.81} $$

$$ \frac{dy(x)}{dx} = \frac{d}{dx} \int_{a(x)}^{b(x)} f(x,t)\, dt = f(x, b(x)) \frac{db(x)}{dx} - f(x, a(x)) \frac{da(x)}{dx} + \int_{a(x)}^{b(x)} \frac{\partial f(x,t)}{\partial x}\, dt. \tag{10.82} $$
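
Leibniz's rule, Eq. (10.82), can be verified numerically by comparing it with a finite-difference derivative of the integral itself. The following is a minimal sketch; the integrand $f(x,t) = \sin(xt)$ and limits $a(x) = x$, $b(x) = x^2$ are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3


def y(x):
    """y(x) = integral from a(x) = x to b(x) = x^2 of sin(x t) dt."""
    return simpson(lambda t: math.sin(x * t), x, x ** 2)


def dy_leibniz(x):
    """dy/dx from Leibniz's rule: two boundary terms plus the interior integral."""
    upper = math.sin(x * x ** 2) * (2 * x)      # f(x, b(x)) * db/dx
    lower = math.sin(x * x) * 1.0               # f(x, a(x)) * da/dx
    interior = simpson(lambda t: t * math.cos(x * t), x, x ** 2)   # df/dx = t cos(x t)
    return upper - lower + interior


def dy_finite_difference(x, h=1e-5):
    return (y(x + h) - y(x - h)) / (2 * h)


if __name__ == "__main__":
    print(dy_leibniz(1.5), dy_finite_difference(1.5))
```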



10.10 Complex numbers 

Here we briefly introduce some basic elements of complex variable theory. Recall that the
imaginary number $i$ is defined such that

$$ i^2 = -1, \qquad i = \sqrt{-1}. \tag{10.83} $$

10.10.1 Euler's formula

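We can get a very useful formula, Euler's formula, by considering the following Taylor
expansions of common functions about $t = 0$:

$$ e^t = 1 + t + \frac{1}{2!}t^2 + \frac{1}{3!}t^3 + \frac{1}{4!}t^4 + \frac{1}{5!}t^5 + \ldots, \tag{10.84} $$

$$ \sin t = t - \frac{1}{3!}t^3 + \frac{1}{5!}t^5 - \ldots, \tag{10.85} $$

$$ \cos t = 1 - \frac{1}{2!}t^2 + \frac{1}{4!}t^4 - \ldots. \tag{10.86} $$

With these expansions now consider the following combinations: $(\cos t + i \sin t)|_{t=\theta}$ and
$e^t|_{t=i\theta}$:

$$ \cos\theta + i\sin\theta = 1 + i\theta - \frac{1}{2!}\theta^2 - i\frac{1}{3!}\theta^3 + \frac{1}{4!}\theta^4 + i\frac{1}{5!}\theta^5 + \ldots, \tag{10.87} $$

$$ e^{i\theta} = 1 + i\theta + \frac{1}{2!}(i\theta)^2 + \frac{1}{3!}(i\theta)^3 + \frac{1}{4!}(i\theta)^4 + \frac{1}{5!}(i\theta)^5 + \ldots, \tag{10.88} $$

$$ = 1 + i\theta - \frac{1}{2!}\theta^2 - i\frac{1}{3!}\theta^3 + \frac{1}{4!}\theta^4 + i\frac{1}{5!}\theta^5 + \ldots. \tag{10.89} $$

As the two series are identical, we have Euler's formula

$$ e^{i\theta} = \cos\theta + i\sin\theta. \tag{10.90} $$

Powers of complex numbers can be easily obtained using de Moivre's⁶ formula:

$$ e^{in\theta} = \cos n\theta + i \sin n\theta. \tag{10.91} $$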
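
The series argument above can be reproduced numerically by summing the exponential series at an imaginary argument and comparing with $\cos\theta + i\sin\theta$. The sketch below uses a hypothetical helper `exp_series` and arbitrary values $\theta = 0.75$, $n = 5$.

```python
import cmath
import math


def exp_series(z, terms=30):
    """Partial sum of the Taylor series of e^z about z = 0."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)
    return total


theta, n = 0.75, 5
series_value = exp_series(1j * theta)                       # e^{i theta} from the series
euler_value = complex(math.cos(theta), math.sin(theta))     # cos(theta) + i sin(theta)

# de Moivre: (cos(theta) + i sin(theta))^n = cos(n theta) + i sin(n theta)
power_value = euler_value ** n
de_moivre_value = complex(math.cos(n * theta), math.sin(n * theta))

if __name__ == "__main__":
    print(series_value, euler_value, cmath.exp(1j * theta))
    print(power_value, de_moivre_value)
```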

10.10.2 Polar and Cartesian representations 

Now if we take $x$ and $y$ to be real numbers and define the complex number $z$ to be

$$ z = x + iy, \tag{10.92} $$

we can multiply and divide by $\sqrt{x^2 + y^2}$ to obtain

$$ z = \sqrt{x^2 + y^2} \left( \frac{x}{\sqrt{x^2 + y^2}} + i \frac{y}{\sqrt{x^2 + y^2}} \right). \tag{10.93} $$

Noting the similarities between this and the transformation between Cartesian and polar
coordinates suggests we adopt

$$ r = \sqrt{x^2 + y^2}, \qquad \cos\theta = \frac{x}{\sqrt{x^2 + y^2}}, \qquad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}}. \tag{10.94} $$



6 Abraham de Moivre, 1667-1754, French mathematician.

Figure 10.10: Polar and Cartesian representation of a complex number z.



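Thus, we have

$$ z = r (\cos\theta + i \sin\theta), \tag{10.95} $$

$$ z = r e^{i\theta}. \tag{10.96} $$

The polar and Cartesian representation of a complex number $z$ is shown in Fig. 10.10.
Now we can define the complex conjugate $\bar{z}$ as

$$ \bar{z} = x - iy, \tag{10.97} $$

$$ \bar{z} = \sqrt{x^2 + y^2} \left( \frac{x}{\sqrt{x^2 + y^2}} - i \frac{y}{\sqrt{x^2 + y^2}} \right), \tag{10.98} $$

$$ \bar{z} = r (\cos\theta - i \sin\theta), \tag{10.99} $$

$$ \bar{z} = r (\cos(-\theta) + i \sin(-\theta)), \tag{10.100} $$

$$ \bar{z} = r e^{-i\theta}. \tag{10.101} $$

Note now that

$$ z\bar{z} = (x + iy)(x - iy) = x^2 + y^2 = |z|^2, \tag{10.102} $$

$$ = r e^{i\theta}\, r e^{-i\theta} = r^2 = |z|^2. \tag{10.103} $$

We also have

$$ \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}, \tag{10.104} $$

$$ \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}. \tag{10.105} $$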
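
These relations map directly onto Python's standard `cmath` module, where `cmath.polar` returns $(r, \theta)$ and `cmath.rect` rebuilds $z = r e^{i\theta}$. The short sketch below uses an arbitrary value $z = 3 + 4i$; it is an illustration, not part of the original notes.

```python
import cmath
import math

z = complex(3.0, 4.0)

r, theta = cmath.polar(z)          # r = |z|, theta = atan2(y, x) in (-pi, pi]
z_rebuilt = cmath.rect(r, theta)   # r (cos(theta) + i sin(theta))
z_bar = z.conjugate()              # x - i y

# sin and cos recovered from complex exponentials
sin_theta = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j
cos_theta = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2

if __name__ == "__main__":
    print(r, theta)
    print(z * z_bar, r ** 2)                       # z times its conjugate equals |z|^2
    print(z_bar, r * cmath.exp(-1j * theta))       # conjugate in polar form
    print(sin_theta, cos_theta)
```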



ICC BY-NC-TW} 29 July 2012, Sen & Powers. 






CHAPTER 10. APPENDIX 



10.10.3 Cauchy-Riemann equations 

Now it is possible to define complex functions of complex variables $W(z)$. For example, take
a complex function to be defined as

$$ W(z) = z^2 + z, \tag{10.106} $$

$$ = (x + iy)^2 + (x + iy), \tag{10.107} $$

$$ = x^2 + 2xyi - y^2 + x + iy, \tag{10.108} $$

$$ = (x^2 + x - y^2) + i (2xy + y). \tag{10.109} $$


In general, we can say

$$ W(z) = \phi(x,y) + i\psi(x,y). \tag{10.110} $$

Here $\phi$ and $\psi$ are real functions of real variables.

Now $W(z)$ is defined as analytic at $z$ if $dW/dz$ exists at $z$ and is independent of the
direction in which it was calculated. That is, using the definition of the derivative

$$ \frac{dW}{dz} = \lim_{\Delta z \to 0} \frac{W(z + \Delta z) - W(z)}{\Delta z}. \tag{10.111} $$



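Now there are many paths that we can choose to evaluate the derivative. Let us consider
two distinct paths, $y = C_1$ and $x = C_2$. We will get a result which can be shown to be valid
for arbitrary paths. For $y = C_1$, we have $\Delta z = \Delta x$, so

$$ \frac{dW}{dz} = \lim_{\Delta x \to 0} \frac{W(x + iy + \Delta x) - W(x + iy)}{\Delta x} = \left. \frac{\partial W}{\partial x} \right|_y. \tag{10.112} $$

For $x = C_2$, we have $\Delta z = i\Delta y$, so

$$ \frac{dW}{dz} = \lim_{\Delta y \to 0} \frac{W(x + iy + i\Delta y) - W(x + iy)}{i\Delta y} = \frac{1}{i} \left. \frac{\partial W}{\partial y} \right|_x = -i \left. \frac{\partial W}{\partial y} \right|_x. \tag{10.113} $$

Now for an analytic function, we need

$$ \left. \frac{\partial W}{\partial x} \right|_y = -i \left. \frac{\partial W}{\partial y} \right|_x, \tag{10.114} $$

or, expanding, we need

$$ \frac{\partial \phi}{\partial x} + i \frac{\partial \psi}{\partial x} = -i \left( \frac{\partial \phi}{\partial y} + i \frac{\partial \psi}{\partial y} \right), \tag{10.115} $$

$$ = \frac{\partial \psi}{\partial y} - i \frac{\partial \phi}{\partial y}. \tag{10.116} $$

For equality, and thus path independence of the derivative, we require

$$ \frac{\partial \phi}{\partial x} = \frac{\partial \psi}{\partial y}, \qquad \frac{\partial \phi}{\partial y} = -\frac{\partial \psi}{\partial x}. \tag{10.117} $$

These are the well known Cauchy-Riemann equations for analytic functions of complex
variables.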

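Now most common functions are easily shown to be analytic. For example, for the function
$W(z) = z^2 + z$, which can be expressed as $W(z) = (x^2 + x - y^2) + i(2xy + y)$, we have

$$ \phi(x,y) = x^2 + x - y^2, \qquad \psi(x,y) = 2xy + y, \tag{10.118} $$

$$ \frac{\partial \phi}{\partial x} = 2x + 1, \qquad \frac{\partial \psi}{\partial x} = 2y, \tag{10.119} $$

$$ \frac{\partial \phi}{\partial y} = -2y, \qquad \frac{\partial \psi}{\partial y} = 2x + 1. \tag{10.120} $$

Note that the Cauchy-Riemann equations are satisfied since $\partial\phi/\partial x = \partial\psi/\partial y$ and
$\partial\phi/\partial y = -\partial\psi/\partial x$. So the derivative is independent of direction, and we can say

$$ \frac{dW}{dz} = \left. \frac{\partial W}{\partial x} \right|_y = (2x + 1) + i(2y) = 2(x + iy) + 1 = 2z + 1. \tag{10.121} $$

We could get this result by ordinary rules of derivatives for real functions.
For an example of a non-analytic function consider $W(z) = \bar{z}$. Thus

$$ W(z) = x - iy. \tag{10.122} $$

So $\phi = x$ and $\psi = -y$, $\partial\phi/\partial x = 1$, $\partial\phi/\partial y = 0$, and $\partial\psi/\partial x = 0$, $\partial\psi/\partial y = -1$. Since
$\partial\phi/\partial x \ne \partial\psi/\partial y$, the Cauchy-Riemann equations are not satisfied, and the derivative depends
on direction.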
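
The two examples above can be checked numerically by estimating the partial derivatives of $\phi$ and $\psi$ with central differences. The sketch below uses a hypothetical helper `cr_residuals` and an arbitrary test point; it returns the two quantities that the Cauchy-Riemann equations require to vanish.

```python
def cr_residuals(W, x, y, h=1e-6):
    """Return (phi_x - psi_y, phi_y + psi_x) at z = x + iy via central differences."""
    dW_dx = (W(complex(x + h, y)) - W(complex(x - h, y))) / (2 * h)
    dW_dy = (W(complex(x, y + h)) - W(complex(x, y - h))) / (2 * h)
    phi_x, psi_x = dW_dx.real, dW_dx.imag
    phi_y, psi_y = dW_dy.real, dW_dy.imag
    return phi_x - psi_y, phi_y + psi_x


# W(z) = z^2 + z is analytic, so both residuals should be near zero.
analytic_residuals = cr_residuals(lambda z: z * z + z, 0.4, -1.3)

# W(z) = conj(z) is not analytic: phi_x - psi_y = 1 - (-1) = 2.
conjugate_residuals = cr_residuals(lambda z: z.conjugate(), 0.4, -1.3)

if __name__ == "__main__":
    print(analytic_residuals)
    print(conjugate_residuals)
```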



Problems 

1. Find the limit as $x \to 0$ of

$$ \frac{4\cosh x + \sinh(\arctan \ln \cos 2x) - 4}{e^{-x} + \arcsin x - \sqrt{1 + x^2}}. $$

2. Find $d\phi/dx$ in two different ways, where

$$ \phi = \int_{x^2}^{x^4} x\sqrt{y}\, dy. $$



3. Determine 

(a) \/l, 

(b) i l <fi. 

4. Write three terms of a Taylor series expansion for the function $f(x) = \exp(\tan x)$ about the point
$x = \pi/4$. For what range of $x$ is the series convergent?

5. Find all complex numbers $z = x + iy$ such that $|z + 2i| = |1 + i|$.

6. Determine linin^oo z„ for z n = ^ + ((n + l)/(n + 2))i. 
7. A particle is constrained to a path which is defined by the function $s(x,y) = x^2 + y - 5 = 0$. The
velocity component in the x-direction is $dx/dt = 2y$. What are the position and velocity components
in the y-direction when $x = 4$?

8. The error function is defined as $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-u^2}\, du$. Determine its derivative with respect to $x$.

9. Verify that 

f w sinna; 
lim / dx = 0. 

n^oo J ff nx 

10. Write a Taylor series expansion for the function $f(x,y) = x^2 \cos y$ about the point $x = 2$, $y = \pi$.
Include the $x^2$, $y^2$ and $xy$ terms.

11. Show that

$$ \phi = \int_0^\infty e^{-t^2} \cos 2tx\, dt, $$

satisfies the differential equation

$$ \frac{d\phi}{dx} + 2\phi x = 0. $$

12. Evaluate the Dirichlet discontinuous integral

$$ I = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin ax}{x}\, dx, $$

for $a \in (-\infty, \infty)$. You can use the results of example 3.11, Greenberg.

13. Defining

$$ u(x,y) = \frac{x^3 - y^3}{x^2 + y^2}, $$

except at $x = y = 0$, where $u = 0$, show that $u_x(x,y)$ exists at $x = y = 0$ but is not continuous there.

14. Using complex numbers show that

(a) $\cos^3 x = \frac{1}{4}(\cos 3x + 3\cos x)$,

(b) $\sin^3 x = \frac{1}{4}(3\sin x - \sin 3x)$.